I need to clean up the numbers in a dataframe column so that no values less than 300 or greater than 900 remain. When I add a condition like that, I get True or False outputs, but I need the numbers themselves. This is what I have:
    df = pd.DataFrame(data=distances)
    df

            parallax
    0     567.170891
    1     677.524304
    2     422.738804
    3     638.037667
    4    9927.293477
    ...          ...
    288  1142.040121
    289   218.383978
    290   506.344691
    291          NaN
    292          NaN
then
    new_df = df.loc[df['parallax'] >= 300, 'parallax'] <= 900
    new_df

    0       True
    1       True
    2       True
    3       True
    4      False
    ...
    284    False
    285    False
    287    False
    288    False
    290     True
Why does it return like that and how can I take out the False and have the numbers back? Is there a better way to do this?
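Your expression does two things in sequence. First, df.loc[df['parallax'] >= 300, 'parallax'] selects the 'parallax' values that are at least 300, which is why the rows below 300 and the NaN rows are missing from your result. Then the trailing <= 900 is applied to that selected Series as a comparison, and comparing a Series with a number returns a Series of True and False rather than the values themselves. To keep only the rows that meet both conditions, combine them into a single boolean mask and index the DataFrame with it; you do not need .loc for this. Here's how you can do it: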
    import pandas as pd

    # Sample DataFrame
    data = {'parallax': [567.170891, 677.524304, 422.738804, 638.037667, 9927.293477, None, None]}
    df = pd.DataFrame(data)

    # Apply the condition to filter the DataFrame
    filtered_df = df[(df['parallax'] >= 300) & (df['parallax'] <= 900)]

    # Display the resulting DataFrame
    print(filtered_df)
This will output:
         parallax
    0  567.170891
    1  677.524304
    2  422.738804
    3  638.037667
Now, filtered_df contains only the rows where the ‘parallax’ column meets the specified condition.
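As a possible shorthand for the two-sided condition above, pandas also provides Series.between, which is inclusive on both ends by default. The following is a minimal sketch that reuses the sample values from this answer; the name in_range is only an illustrative choice.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({'parallax': [567.170891, 677.524304, 422.738804,
                                638.037667, 9927.293477, None, None]})

# between() is inclusive on both ends by default;
# NaN values compare as False, so they are excluded from the mask
in_range = df['parallax'].between(300, 900)

# Index the DataFrame with the boolean mask to keep only matching rows
filtered_df = df[in_range]
print(filtered_df)
```

This should select the same rows as the explicit (>= 300) & (<= 900) mask, so which form to use is mostly a matter of readability.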
If you also want to replace the values that do not meet the condition with NaN, you can use numpy.where:
    import numpy as np

    # Apply the condition and replace values outside the range with NaN
    df['parallax'] = np.where((df['parallax'] >= 300) & (df['parallax'] <= 900), df['parallax'], np.nan)

    # Display the modified DataFrame
    print(df)

This will output:

         parallax
    0  567.170891
    1  677.524304
    2  422.738804
    3  638.037667
    4         NaN
    5         NaN
    6         NaN
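If you would rather stay within pandas for the replacement step, Series.where is one alternative to numpy.where: it keeps the values where the condition is True and fills the rest with NaN by default. This is a sketch on the same sample data, and in_range is again just an illustrative name.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({'parallax': [567.170891, 677.524304, 422.738804,
                                638.037667, 9927.293477, None, None]})

# Boolean mask: True where the value lies in [300, 900]
in_range = df['parallax'].between(300, 900)

# Series.where keeps values where the mask is True
# and replaces everything else with NaN
df['parallax'] = df['parallax'].where(in_range)
print(df)
```

Unlike filtering, this keeps all rows and the original index, which can be convenient if the column has to stay aligned with other columns.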