一尘不染

Elasticsearch批量索引JSON数据

elasticsearch

我正在尝试将JSON文件批量索引到新的Elasticsearch索引中,但无法这样做。我在JSON中有以下示例数据

[{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"},
{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"},
{"Amount": "2107", "Quantity": "3", "Id": "974920111", "Client_Store_sk": "1109"},
{"Amount": "2115", "Quantity": "2", "Id": "975463798", "Client_Store_sk": "1109"},
{"Amount": "2116", "Quantity": "1", "Id": "975463827", "Client_Store_sk": "1109"},
{"Amount": "648", "Quantity": "3", "Id": "975464139", "Client_Store_sk": "1109"},
{"Amount": "2126", "Quantity": "2", "Id": "975464805", "Client_Store_sk": "1109"},
{"Amount": "2133", "Quantity": "1", "Id": "975464061", "Client_Store_sk": "1109"},
{"Amount": "1339", "Quantity": "4", "Id": "974919458", "Client_Store_sk": "1109"},
{"Amount": "1196", "Quantity": "5", "Id": "974920538", "Client_Store_sk": "1109"},
{"Amount": "1198", "Quantity": "4", "Id": "975463638", "Client_Store_sk": "1109"},
{"Amount": "1345", "Quantity": "4", "Id": "974919522", "Client_Store_sk": "1109"},
{"Amount": "1347", "Quantity": "2", "Id": "974919563", "Client_Store_sk": "1109"},
{"Amount": "673", "Quantity": "2", "Id": "975464359", "Client_Store_sk": "1109"},
{"Amount": "2153", "Quantity": "1", "Id": "975464511", "Client_Store_sk": "1109"},
{"Amount": "3896", "Quantity": "4", "Id": "977289342", "Client_Store_sk": "1109"},
{"Amount": "3897", "Quantity": "4", "Id": "974920602", "Client_Store_sk": "1109"}]

我在用

 curl -XPOST localhost:9200/index_local/my_doc_type/_bulk --data-binary --data @/home/data1.json

当我尝试使用Elasticsearch的标准批量索引API时,出现此错误

 error: {"message":"ActionRequestValidationException[Validation Failed: 1: no requests added;]"}

任何人都可以帮助索引这种类型的JSON吗?


阅读 312

收藏
2020-06-22

共1个答案

一尘不染

您需要做的是读取该JSON文件,然后使用_bulk端点期望的格式构建一个批量请求,即,一行用于命令,一行用于文档,并用换行符分隔…冲洗并重复以下步骤每个文件:

curl -XPOST localhost:9200/your_index/_bulk -d '
{"index": {"_index": "your_index", "_type": "your_type", "_id": "975463711"}}
{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
{"index": {"_index": "your_index", "_type": "your_type", "_id": "975463943"}}
{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
... etc for all your documents
'

只要确保替换your_indexyour_type你正在使用的实际索引和类型名称。

更新

请注意,可以通过删除_index_type如果在URL中指定了命令行来缩短命令行。_id如果您在映射中指定id字段路径,也可以删除它(请注意,此功能在ES
2.0中已被弃用)。至少,{"index":{}}对于所有文档,命令行看起来都一样,但是对于指定要执行哪种操作(在本例中index为文档),它将始终是强制性的

更新2

curl -XPOST localhost:9200/index_local/my_doc_type/_bulk --data-binary  @/home/data1.json

/home/data1.json 应该看起来像这样:

{"index":{}}
{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
{"index":{}}
{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
{"index":{}}
{"Amount": "2107", "Quantity": "3", "Id": "974920111", "Client_Store_sk": "1109"}
2020-06-22