一尘不染

Pandas: conditionally creating a Series/DataFrame column

python pandas

I have the following DataFrame:

    Type       Set
1    A          Z
2    B          Z           
3    B          X
4    C          Y

I would like to add another column to the DataFrame (or generate a Series) of the same length as the DataFrame (equal number of records/rows) that sets the color to 'green' if Set == 'Z' and to 'red' otherwise.

What is the best way to do this?


2020-02-04

1 answer

一尘不染

If you only have two choices to select from:

df['color'] = np.where(df['Set']=='Z', 'green', 'red')

For example,

import pandas as pd
import numpy as np

df = pd.DataFrame({'Type':list('ABBC'), 'Set':list('ZZXY')})
df['color'] = np.where(df['Set']=='Z', 'green', 'red')
print(df)

Output (recent pandas versions keep the columns in the order they appear in the dict):

  Type Set  color
0    A   Z  green
1    B   Z  green
2    B   X    red
3    C   Y    red
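
The question also mentions generating a standalone Series rather than a new column. A minimal sketch of that variant, assuming the same example data as above (the variable name `color` is only illustrative): `np.where` returns a plain NumPy array, so it can be wrapped in `pd.Series` with the DataFrame's index to keep the labels aligned.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Type': list('ABBC'), 'Set': list('ZZXY')})

# np.where returns a NumPy array without an index, so wrap it in a Series
# and reuse df.index to keep it aligned with the DataFrame's rows.
color = pd.Series(
    np.where(df['Set'] == 'Z', 'green', 'red'),
    index=df.index,
    name='color',
)
print(color.tolist())  # ['green', 'green', 'red', 'red']
```

Reusing `df.index` matters mostly when the DataFrame does not have the default 0..n-1 index, since a Series built without it would otherwise align on fresh integer labels.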

If you have more than two conditions, use np.select. For example, if you want color to be

  • yellow when (df['Set'] == 'Z') & (df['Type'] == 'A')
  • otherwise blue when (df['Set'] == 'Z') & (df['Type'] == 'B')
  • otherwise purple when (df['Type'] == 'B')
  • otherwise black,

then use

df = pd.DataFrame({'Type':list('ABBC'), 'Set':list('ZZXY')})
conditions = [
    (df['Set'] == 'Z') & (df['Type'] == 'A'),
    (df['Set'] == 'Z') & (df['Type'] == 'B'),
    (df['Type'] == 'B')]
choices = ['yellow', 'blue', 'purple']
df['color'] = np.select(conditions, choices, default='black')
print(df)

Output:

  Type Set   color
0    A   Z  yellow
1    B   Z    blue
2    B   X  purple
3    C   Y   black
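
The "otherwise" wording in the list of conditions relies on np.select taking, for each row, the first condition in the list that is True. A small sketch of why the order matters, assuming the same example data (the names `is_z_and_b`, `is_b`, `specific_first` and `general_first` are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Type': list('ABBC'), 'Set': list('ZZXY')})

is_z_and_b = (df['Set'] == 'Z') & (df['Type'] == 'B')
is_b = df['Type'] == 'B'

# Row 1 (Type 'B', Set 'Z') satisfies both conditions. np.select picks the
# choice belonging to the first True condition, so list order decides.
specific_first = np.select([is_z_and_b, is_b], ['blue', 'purple'], default='black')
general_first = np.select([is_b, is_z_and_b], ['purple', 'blue'], default='black')

print(specific_first.tolist())  # ['black', 'blue', 'purple', 'black']
print(general_first.tolist())   # ['black', 'purple', 'purple', 'black']
```

With the more general condition listed first, the more specific one can never match, so it is usually safer to order conditions from most to least specific.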
2020-02-04