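I have a JSON file containing 1000 JSON objects. Is there a way to add a header line before each JSON document? What is the simplest way to do it?
Example: I have 1000 objects like this: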
{"id":58,"first_name":"Louis","last_name":"Jordan","email":"ljordan1l@nature.com","gender":"Male","Latitude":"-15.93444","Longitude":"-50.14028"}
I want to add an index header like the one below before each JSON object, so that I can use the file with the Elasticsearch Bulk API:
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "unique_id" } }
{"id":58,"first_name":"Louis","last_name":"Jordan","email":"ljordan1l@nature.com","gender":"Male","Latitude":"-15.93444","Longitude":"-50.14028"}
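If you are willing to use Logstash, you do not need to modify your file at all. You can simply read it line by line and stream it to ES with the elasticsearch output, which uses the Bulk API.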
Store the following Logstash configuration in a file named es.conf (make sure the file path and the ES hosts match your setup):
input {
  file {
    path => "/path/to/your/json"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => "json"
  }
}
filter {
  mutate {
    remove_field => ["@version", "@timestamp"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "test"
    document_type => "type1"
    document_id => "%{id}"
  }
}
You then need to install Logstash, after which you can run the following command to load your JSON file into your ES server:
bin/logstash -f es.conf
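If you would rather generate the bulk file itself, as the question originally asked, a small script can do it. The following is only a minimal sketch, not part of the Logstash approach above. It assumes the input file holds one JSON object per line and that each object's `id` field should become the `_id`, mirroring `document_id => "%{id}"` in the config. The function names `to_bulk_lines` and `convert` are hypothetical, chosen for illustration.

```python
import json


def to_bulk_lines(lines, index="test", doc_type="type1", id_field="id"):
    """Yield a Bulk API action line followed by the source line for each document.

    Assumption: every non-blank input line is one complete JSON object.
    """
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        doc = json.loads(raw)
        # Action/metadata line, using the document's own id as _id (assumed).
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc[id_field]}}
        yield json.dumps(action)
        yield json.dumps(doc)


def convert(src_path, dst_path):
    """Write a newline-delimited bulk file; each line ends with a newline."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in to_bulk_lines(src):
            dst.write(line + "\n")
```

Calling `convert("/path/to/your/json", "bulk.json")` would produce a file in which every document is preceded by its own header line. The header and the document must sit on separate lines, and the last line must end with a newline, for the Bulk API to accept the body.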