一尘不染

Using logstash to load CSV geodata into elasticsearch as geo_point type

elasticsearch

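Below is a reproducible example of a problem I am having with the latest versions of logstash and elasticsearch.

I am using logstash to load geospatial data from a CSV into elasticsearch as geo_points.

The CSV looks like this: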

$ head simple_base_map.csv 
"lon","lat"
-1.7841,50.7408
-1.7841,50.7408
-1.78411,50.7408
-1.78412,50.7408
-1.78413,50.7408
-1.78414,50.7408
-1.78415,50.7408
-1.78416,50.7408
-1.78416,50.7408

I created a mapping template as follows:

$ cat simple_base_map_template.json 
{
  "template": "base_map_template",
  "order":    1,
  "settings": {
    "number_of_shards": 1
  },

  "mappings": {
    "node_points" : {
      "properties" : {
        "location" : { "type" : "geo_point" }
      }
    }
  }
}

and I have a logstash config file as follows:

$ cat simple_base_map.conf 
input {
  stdin {}
}

filter {
  csv {
      columns => [
        "lon", "lat"
      ]
  }

  if [lon] == "lon" {
      drop { }
  } else {
      mutate {
          remove_field => [ "message", "host", "@timestamp", "@version"     ]
      }
      mutate {
          convert => { "lon" => "float" }
          convert => { "lat" => "float" }
      }

      mutate {
          rename => {
              "lon" => "[location][lon]"
              "lat" => "[location][lat]"
          }
      }
  }
}

output {
  stdout { codec => dots }
  elasticsearch {
      index => "base_map_simple"
      template => "simple_base_map_template.json"
      document_type => "node_points"
  }
}

I then run the following:

$ cat simple_base_map.csv | logstash-2.1.3/bin/logstash -f simple_base_map.conf
Settings: Default filter workers: 16
Logstash startup completed
....................................................................................................Logstash shutdown completed

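However, when looking at the index base_map_simple, it shows that the documents have no location field of type geo_point; instead, lat and lon are mapped as two doubles.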

$ curl -XGET 'localhost:9200/base_map_simple?pretty'
{
  "base_map_simple" : {
    "aliases" : { },
    "mappings" : {
      "node_points" : {
        "properties" : {
          "location" : {
            "properties" : {
              "lat" : {
                "type" : "double"
              },
              "lon" : {
                "type" : "double"
              }
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1457355015883",
        "uuid" : "luWGyfB3ToKTObSrbBbcbw",
        "number_of_replicas" : "1",
        "number_of_shards" : "5",
        "version" : {
          "created" : "2020099"
        }
      }
    },
    "warmers" : { }
  }
}

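How would I change any of the above files to make sure the data goes into elasticsearch as a geo_point type?

Finally, I would like to be able to run a nearest-neighbour search on the geo_points using a command such as the following: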

curl -XGET 'localhost:9200/base_map_simple/_search?pretty' -d'
{
    "size": 1,
    "sort": {
   "_geo_distance" : {
       "location" : {
            "lat" : 50,
            "lon" : -1
        },
        "order" : "asc",
        "unit": "m"
   } 
    }
}'

Thanks


2020-06-22

1 answer

一尘不染

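The problem is that in the elasticsearch output you name the index base_map_simple, while in your template the template property is base_map_template. The two do not match, so the template is not applied when the new index is created. The template property must be a pattern that matches the name of the index being created for the template to take effect.

It will work if you simply change the latter to base_map_*, for example: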

{
  "template": "base_map_*",             <--- change this
  "order": 1,
  "settings": {
    "index.number_of_shards": 1
  },
  "mappings": {
    "node_points": {
      "properties": {
        "location": {
          "type": "geo_point"
        }
      }
    }
  }
}
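
To see why the original pattern fails and the wildcard one works, here is a minimal sketch that uses shell globbing as a stand-in for the wildcard matching elasticsearch applies to the template property. The helper name template_matches is hypothetical, and shell globs also understand ? and [...], which may differ from what elasticsearch supports, so treat this as an approximation only.

```shell
# Hypothetical helper: succeeds when an index name matches a template pattern.
# Shell globbing only approximates elasticsearch's wildcard matching.
template_matches() {
  tpl_pattern=$1
  index_name=$2
  case "$index_name" in
    $tpl_pattern) return 0 ;;   # unquoted so that * acts as a wildcard
    *) return 1 ;;
  esac
}

# Compare the original template value with the corrected one.
for candidate in 'base_map_template' 'base_map_*'; do
  if template_matches "$candidate" 'base_map_simple'; then
    echo "$candidate matches base_map_simple"
  else
    echo "$candidate does not match base_map_simple"
  fi
done
```

Only the second pattern matches, which is consistent with the template being ignored in the original setup.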

UPDATE

Make sure to delete the current index as well as the template first.

curl -XDELETE localhost:9200/base_map_simple
curl -XDELETE localhost:9200/_template/logstash
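
After deleting both and re-running logstash, it may be worth checking that the new index really picked up the geo_point mapping. The sketch below assumes the same localhost:9200 endpoint used above; is_geo_point is a hypothetical helper that does a rough text match on the mapping response rather than parsing the JSON properly.

```shell
# Hypothetical helper: reads a mapping response on stdin and succeeds
# if the "location" field is mapped as geo_point.
is_geo_point() {
  # Strip spaces and newlines so pretty and compact responses look the same.
  tr -d ' \n' | grep -q '"location":{"type":"geo_point"'
}

# Example usage (assumes elasticsearch is running on localhost:9200):
#   curl -s 'localhost:9200/base_map_simple/_mapping' | is_geo_point \
#     && echo "location is a geo_point" \
#     || echo "location is NOT a geo_point"
```

If the check fails, the mapping shown by the GET request in the question will most likely still list lat and lon as doubles, which suggests the template was not applied.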
2020-06-22