小能豆

How to replace numbers based on a condition without getting True/False output

python

I need to replace numbers in a dataframe column so that no values less than 300 or greater than 900 remain. When I try to add a condition like that, I get True or False output, but I need the numbers themselves. This is what I have:

df = pd.DataFrame(data=distances)
df

parallax
0   567.170891
1   677.524304
2   422.738804
3   638.037667
4   9927.293477
... ...
288 1142.040121
289 218.383978
290 506.344691
291 NaN
292 NaN

then

new_df=df.loc[df['parallax'] >= 300, 'parallax'] <= 900
new_df

0       True
1       True
2       True
3       True
4      False
       ...  
284    False
285    False
287    False
288    False
290     True

Why does it return like that and how can I take out the False and have the numbers back? Is there a better way to do this?


2023-12-22

1 answer

小能豆

The expression returns booleans because of where the second comparison sits. df.loc[df['parallax'] >= 300, 'parallax'] first selects the values that are at least 300 and returns them as a Series; the trailing <= 900 then compares that Series element by element, which yields a Series of True and False values rather than a selection. To keep only the rows that meet both conditions, combine them into a single boolean mask with & (each comparison in its own parentheses) and index the DataFrame with that mask:

import pandas as pd

# Sample DataFrame
data = {'parallax': [567.170891, 677.524304, 422.738804, 638.037667, 9927.293477, None, None]}
df = pd.DataFrame(data)

# Apply the condition to filter the DataFrame
filtered_df = df[(df['parallax'] >= 300) & (df['parallax'] <= 900)]

# Display the resulting DataFrame
print(filtered_df)

This will output:

     parallax
0  567.170891
1  677.524304
2  422.738804
3  638.037667

Now filtered_df contains only the rows whose 'parallax' value lies between 300 and 900 inclusive. Rows holding NaN are dropped as well, because comparisons against NaN evaluate to False.
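
To see why the expression in the question returns booleans, it can help to split it into its two steps. The sketch below is illustrative rather than the asker's actual code: it reuses the sample values from above, and the names at_least_300, as_written and in_range are hypothetical. It also assumes Series.between, with its default inclusive bounds, as a compact stand-in for the two comparisons.

```python
import pandas as pd

# Small stand-in for the question's data (values copied from the sample above)
df = pd.DataFrame({"parallax": [567.170891, 677.524304, 422.738804,
                                638.037667, 9927.293477, None, None]})

# Step 1: .loc with a row mask and a column label returns the values >= 300
at_least_300 = df.loc[df["parallax"] >= 300, "parallax"]

# Step 2: comparing that Series with <= 900 yields booleans, not a selection
as_written = at_least_300 <= 900

# Putting both bounds inside the row indexer keeps the numbers instead
in_range = df.loc[df["parallax"].between(300, 900), "parallax"]

print(as_written)
print(in_range)
```

Here as_written holds True or False for the five values that passed the first test, which is the kind of output shown in the question, while in_range holds the four numbers themselves.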

If you would rather keep every row and replace the values that do not meet the condition with NaN, you can use numpy.where:

import numpy as np

# Apply the condition and replace values outside the range with NaN
df['parallax'] = np.where((df['parallax'] >= 300) & (df['parallax'] <= 900), df['parallax'], np.nan)

# Display the modified DataFrame
print(df)

This will output:

     parallax
0  567.170891
1  677.524304
2  422.738804
3  638.037667
4         NaN
5         NaN
6         NaN
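
A pandas-only alternative, sketched with the same sample data, is Series.where, which keeps values where a condition holds and fills the rest with NaN by default. The name mask is illustrative, and between(300, 900) is assumed to be an acceptable replacement for the pair of comparisons, since it includes both endpoints by default.

```python
import pandas as pd

# Same sample values as in the answer above
df = pd.DataFrame({"parallax": [567.170891, 677.524304, 422.738804,
                                638.037667, 9927.293477, None, None]})

# between() is inclusive on both ends by default; NaN entries give False
mask = df["parallax"].between(300, 900)

# where() keeps values where the mask is True and writes NaN elsewhere
df["parallax"] = df["parallax"].where(mask)

print(df)
```

Passing a second argument, for example where(mask, 0), would substitute a different replacement value; which value is appropriate depends on the analysis.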
2023-12-22