一尘不染

pandas: remove null values when calling to_json

json

I have a pandas DataFrame that I want to save in JSON format. The pandas documentation says:

Note NaN's, NaT's and None will be converted to null and datetime objects will be converted based on the date_format and date_unit parameters.

Then, using the orient option records, I get something like this:

[{"A":1,"B":4,"C":7},{"A":null,"B":5,"C":null},{"A":3,"B":null,"C":null}]

Is it possible to get this instead:

[{"A":1,"B":4,"C":7},{"B":5},{"A":3}]

Thanks


2020-07-27

1 answer

一尘不染

The following is close to what you want. Essentially, we create a list of the non-NaN values for each row and then call to_json on the result:

In [136]:
df.apply(lambda x: [x.dropna()], axis=1).to_json()

Out[136]:
'{"0":[{"a":1.0,"b":4.0,"c":7.0}],"1":[{"b":5.0}],"2":[{"a":3.0}]}'
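If the exact list-of-records shape from the question is needed, one possible variation is to build the trimmed rows as plain dicts and serialize them with the standard json module instead of to_json. This is only a sketch: the contents of df are an assumption reconstructed from the outputs shown in this answer, and the names records and result are illustrative.

```python
import json

import numpy as np
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this answer.
df = pd.DataFrame({
    "a": [1, np.nan, 3],
    "b": [4, 5, np.nan],
    "c": [7, np.nan, np.nan],
})

# Drop the NaN entries of each row, then turn what is left into a plain dict.
records = [row.dropna().to_dict() for _, row in df.iterrows()]

# Serialize with the standard library rather than DataFrame.to_json.
result = json.dumps(records)
print(result)
```

Because the NaN values force every column to a float dtype, the numbers are serialized as 1.0, 4.0 and so on, just as in the output above.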

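Creating a list here is necessary. Otherwise pandas will try to align the result with the shape of the original df, which reintroduces the NaN values you want to avoid: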

In [138]:
df.apply(lambda x: pd.Series(x.dropna()), axis=1).to_json()

Out[138]:
'{"a":{"0":1.0,"1":null,"2":3.0},"b":{"0":4.0,"1":5.0,"2":null},"c":{"0":7.0,"1":null,"2":null}}'
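The alignment effect can be reproduced outside of apply. The sketch below, again using an assumed df reconstructed from the outputs in this answer, shows that each row loses its NaN entries on its own, but that stacking the trimmed rows back into a single DataFrame aligns them on the union of their labels and fills the gaps with NaN again. The names trimmed and realigned are illustrative.

```python
import numpy as np
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this answer.
df = pd.DataFrame({
    "a": [1, np.nan, 3],
    "b": [4, 5, np.nan],
    "c": [7, np.nan, np.nan],
})

# Each row, taken alone, really does lose its NaN entries.
trimmed = [row.dropna() for _, row in df.iterrows()]
print([list(s.index) for s in trimmed])

# Stacking the trimmed rows back into one frame aligns them on the union of
# their labels, so the missing cells are filled with NaN again.
realigned = pd.DataFrame(trimmed)
print(realigned.isna().sum().sum())
```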

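Calling list on the result of dropna does not help either: the result is broadcast to the original shape, as if the missing cells had been filled: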

In [137]:
df.apply(lambda x: list(x.dropna()), axis=1).to_json()

Out[137]:
'{"a":{"0":1.0,"1":5.0,"2":3.0},"b":{"0":4.0,"1":5.0,"2":3.0},"c":{"0":7.0,"1":5.0,"2":3.0}}'
2020-07-27