Creating nested JSON from CSV

一尘不染

json

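I have already read "Create nested JSON from flat csv", but it did not help in my case.

I have quite a large spreadsheet created with Google Docs, consisting of 11 rows and 74 columns (some columns are not occupied).

I created an example on Google Drive. When exported as a CSV, it looks like this: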

id,name,email,phone,picture01,picture02,picture03,status
1,Alice,alice@gmail.com,2131232,"image01_01
[this is an image]",image01_02,image01_03,single
2,Bob,bob@gmail.com,2854839,image02_01,"image02_02
[description to image 2]",,married
3,Frank,frank@gmail.com,987987,image03_01,image03_02,,single
4,Shawn,shawn@gmail.com,,image04_01,,,single

Now I would like to have a JSON structure that looks like this:

{
    "persons": [
        {
            "type": "config.profile",
            "id": "1",
            "email": "alice@gmail.com",
            "pictureId": "p01",
            "statusId": "s01"
        },
        {
            "type": "config.pictures",
            "id": "p01",
            "album": [
                {
                    "image": "image01_01",
                    "description": "this is an image"
                },
                {
                    "image": "image01_02",
                    "description": ""
                },
                {
                    "image": "image01_03",
                    "description": ""
                }
            ]
        },
        {
            "type": "config.status",
            "id": "s01",
            "status": "single"
        },
        {
            "type": "config.profile",
            "id": "2",
            "email": "bob@gmail.com",
            "pictureId": "p02",
            "statusId": "s02"
        },
        {
            "type": "config.pictures",
            "id": "p02",
            "album": [
                {
                    "image": "image02_01",
                    "description": ""
                },
                {
                    "image": "image02_02",
                    "description": "description to image 2"
                }
            ]
        },
        {
            "type": "config.status",
            "id": "s02",
            "status": "married"
        }
    ]
}

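And so on.

My theoretical approach would be to go through the CSV file row by row (and here is the first problem: usually each record is one line, but sometimes it spans several lines, so would I need to count commas?). Each row corresponds to one config.profile block, including id, email, pictureId, and statusId (the latter two are generated depending on the row number).

Then, for each row, a config.pictures block is generated whose id is the same as the pictureId inserted into the config.profile block. The album is an array with as many elements as there are pictures given.

Finally, each row gets a config.status block, whose id is again the same as the statusId given in the config.profile block, and which has one status entry with the corresponding status.

I am entirely clueless about how to create the nested and conditional JSON file.

So far I have only got to the point where I convert the CSV to valid JSON, without any nesting and without the additional information that is not directly given in the CSV, such as type, pictureId, statusId, and so on.

Any help is appreciated. If this is easier to program in another scripting language (like ruby), I would gladly switch to that.

Before someone thinks this is homework or something similar: it is not. I just want to automate an otherwise very tedious copy-and-paste task.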

2020-07-27

1 answer

一尘不染

The csv module will handle the CSV reading nicely, including line breaks inside quoted fields.

import csv
with open('my_csv.csv', newline='') as csv_file:
    for row in csv.reader(csv_file):
        print(row)  # each row is a list of strings; do the real work here

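The csv.reader object is an iterator, so you can walk through the rows of the CSV with a for loop. Each row is a list, so you can get each field as row[0], row[1], and so on. Be aware that this also loads the first row (which in your case just contains the field names).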
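To show that there is no need to count commas, here is a minimal sketch that feeds two records from the question's sample through csv.reader. It reads from an in-memory string via io.StringIO instead of a file, and the names SAMPLE and read_rows are just placeholders I made up for the illustration.

```python
import csv
import io

# Two records from the question's sample; the picture01 field of the
# first record is quoted and spans two physical lines.
SAMPLE = (
    'id,name,email,phone,picture01,picture02,picture03,status\n'
    '1,Alice,alice@gmail.com,2131232,"image01_01\n'
    '[this is an image]",image01_02,image01_03,single\n'
    '4,Shawn,shawn@gmail.com,,image04_01,,,single\n'
)


def read_rows(text):
    """Return every record (header included) as a list of strings."""
    # newline='' leaves line endings untouched so the csv module can
    # handle newlines inside quoted fields itself.
    return list(csv.reader(io.StringIO(text, newline='')))


rows = read_rows(SAMPLE)
header, records = rows[0], rows[1:]

print(len(records))          # logical records, not physical lines
print(repr(records[0][4]))   # the quoted field keeps its embedded newline
print(repr(records[1][3]))   # an unoccupied cell comes back as ''
```

The five physical lines of the sample come back as one header row plus two records, and the empty phone cell is simply an empty string.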

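Since the first row gives us the field names, we can use csv.DictReader so that the fields in each row can be accessed as row['id'], row['name'], and so on. This also skips the first row for us: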

import csv
with open('my_csv.csv', newline='') as csv_file:
    for row in csv.DictReader(csv_file):
        print(row['id'], row['name'])  # fields by column name; do the real work here

For the JSON export, use the json module. json.dumps() takes Python data structures (such as lists and dictionaries) and returns the corresponding JSON string:

import json
my_data = {'id': 123, 'name': 'Test User', 'emails': ['test@example.com', 'test@hotmail.com']}
my_data_json = json.dumps(my_data)
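If you want the output indented like the JSON in the question, json.dumps() accepts an indent argument. A small sketch with the same example dictionary (the variable names compact and pretty are mine):

```python
import json

my_data = {
    'id': 123,
    'name': 'Test User',
    'emails': ['test@example.com', 'test@hotmail.com'],
}

compact = json.dumps(my_data)           # everything on a single line
pretty = json.dumps(my_data, indent=4)  # one key per line, 4-space indent

print(compact)
print(pretty)

# Both strings parse back to the same Python structure.
assert json.loads(compact) == json.loads(pretty) == my_data
```

To write straight to a file instead of building a string, json.dump(my_data, some_file, indent=4) takes the same options.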

If you want to produce the JSON output exactly as you posted it, you could do something like the following:

import csv
import json

output = {'persons': []}
with open('my_csv.csv', newline='') as csv_file:
    for person in csv.DictReader(csv_file):
        output['persons'].append({
            'type': 'config.profile',
            'id': person['id'],
            # ...add other fields (email etc) here...
        })

        # ...do similar for config.pictures, config.status, etc...

output_json = json.dumps(output)

output_json will then contain the JSON output you want.
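For completeness, here is one way the elided parts could be filled in. Treat it as a sketch under assumptions, not the definitive solution: I am assuming that pictureId and statusId are "p" or "s" plus the zero-padded row number, that every picture column name starts with "picture", and that a description, when present, sits on the second line of the cell in square brackets. The helpers split_picture and build_persons are names I invented, and the data is read from an inline string rather than a file so the example is self-contained.

```python
import csv
import io
import json

SAMPLE = (
    'id,name,email,phone,picture01,picture02,picture03,status\n'
    '1,Alice,alice@gmail.com,2131232,"image01_01\n'
    '[this is an image]",image01_02,image01_03,single\n'
    '2,Bob,bob@gmail.com,2854839,image02_01,"image02_02\n'
    '[description to image 2]",,married\n'
    '3,Frank,frank@gmail.com,987987,image03_01,image03_02,,single\n'
    '4,Shawn,shawn@gmail.com,,image04_01,,,single\n'
)


def split_picture(cell):
    """Split 'image<newline>[description]' into an album entry."""
    image, _, rest = cell.partition('\n')
    description = rest.strip()
    # Assumption: descriptions are wrapped in one pair of square brackets.
    if description.startswith('[') and description.endswith(']'):
        description = description[1:-1]
    return {'image': image.strip(), 'description': description}


def build_persons(csv_text):
    """Build the flat 'persons' list: three blocks per CSV record."""
    persons = []
    reader = csv.DictReader(io.StringIO(csv_text, newline=''))
    for number, row in enumerate(reader, start=1):
        # Assumption: ids are derived from the row number (p01, s01, ...).
        picture_id = 'p%02d' % number
        status_id = 's%02d' % number
        # Only non-empty picture columns end up in the album.
        album = [split_picture(value)
                 for key, value in row.items()
                 if key and key.startswith('picture') and value]
        persons.append({'type': 'config.profile', 'id': row['id'],
                        'email': row['email'],
                        'pictureId': picture_id, 'statusId': status_id})
        persons.append({'type': 'config.pictures', 'id': picture_id,
                        'album': album})
        persons.append({'type': 'config.status', 'id': status_id,
                        'status': row['status']})
    return {'persons': persons}


output = build_persons(SAMPLE)
print(json.dumps(output, indent=4))
```

With a real file you would pass the opened file object to csv.DictReader instead of the io.StringIO wrapper; the rest stays the same.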

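However, I would suggest you think carefully about the structure of the JSON output you are aiming for. At the moment you are defining an outer dictionary that serves no purpose, and you are adding all of the "config" data directly under "persons". You may want to reconsider this.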
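Just to illustrate what a different layout might look like (this shape is purely my guess, since I do not know what will consume the file): each person becomes a single object with the album and status nested inside, so no generated ids are needed to link the blocks. nested_person is a made-up helper name, and stripping the square brackets from the description is an assumption about your data.

```python
import csv
import io
import json

SAMPLE = (
    'id,name,email,phone,picture01,picture02,picture03,status\n'
    '2,Bob,bob@gmail.com,2854839,image02_01,"image02_02\n'
    '[description to image 2]",,married\n'
)


def nested_person(row):
    """Turn one CSV record into a single self-contained object."""
    album = []
    for key, value in row.items():
        if key and key.startswith('picture') and value:
            image, _, description = value.partition('\n')
            # Assumption: drop surrounding brackets and spaces.
            album.append({'image': image,
                          'description': description.strip('[] ')})
    return {'id': row['id'], 'email': row['email'],
            'status': row['status'], 'album': album}


reader = csv.DictReader(io.StringIO(SAMPLE, newline=''))
persons = [nested_person(row) for row in reader]
print(json.dumps(persons, indent=4))
```

Whether this is better depends entirely on the program reading the JSON; if it expects the flat config.* blocks, keep the original layout.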

2020-07-27