ElasticSearch - Return the complete value of a facet for a query

一尘不染

elasticsearch

I recently started using ElasticSearch. I am working through a few use cases, and I have a problem with one of them.

I have indexed some users with their full names (e.g. "Jean-Paul Gautier", "Jean De La Fontaine").

I am trying to get all the full names that match a given query.

For example, I want the 100 most frequent full names starting with "J":

{
  "query": {
    "query_string": { "query": "full_name:J*" }
  },
  "facets":{
    "name":{
      "terms":{
        "field": "full_name",
        "size":100
      }
    }
  }
}

The result I get is every individual word of the full names: "Jean", "Paul", "Gautier", "De", "La", "Fontaine".

How can I get "Jean-Paul Gautier" and "Jean De La Fontaine" (all the full_name values starting with "J")? The "post_filter" option does not do this; it only restricts the subset above.

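Do I have to configure how the facet works for the full_name field?
Do I have to add some options to the current query?
Do I have to set up some "mapping" (still very unclear to me at the moment)?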

Thanks.

2020-06-22

1 answer

一尘不染

You just need to set "index": "not_analyzed" on the field, and the facet will return the complete, unmodified field values.

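Typically, it is good to have one version of the field that is not analyzed (for faceting) and another that is analyzed (for searching). The "multi_field" field type is useful for this.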
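To make the distinction concrete, here is a rough Python sketch (not Elasticsearch code) of what a terms facet ends up counting in each case. The helpers `analyze`, `not_analyzed`, and `terms_facet` are hypothetical names for illustration, and `analyze` is only a crude approximation of the standard analyzer: it splits on non-word characters and lowercases, which is enough for these two names.

```python
import re
from collections import Counter


def analyze(value):
    """Crude stand-in for the standard analyzer (an assumption, not the real one):
    split on non-word characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"\w+", value)]


def not_analyzed(value):
    """A not_analyzed field indexes the whole value as one single term."""
    return [value]


def terms_facet(values, to_terms):
    """Count, for each indexed term, how many documents contain it."""
    counts = Counter()
    for value in values:
        counts.update(set(to_terms(value)))
    return counts


full_names = ["Jean-Paul Gautier", "Jean De La Fontaine"]

analyzed_counts = terms_facet(full_names, analyze)
untouched_counts = terms_facet(full_names, not_analyzed)

print(sorted(analyzed_counts.items()))   # single lowercased words
print(sorted(untouched_counts.items()))  # complete, unmodified names
```

Faceting on the analyzed terms can only ever return the individual words, because those words are all that was indexed; the complete names are available only where the value was indexed as one term.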

So in this case, I could define the mapping as follows:

curl -XPUT "http://localhost:9200/test_index/" -d'
{
   "mappings": {
      "people": {
         "properties": {
            "full_name": {
               "type": "multi_field",
               "fields": {
                  "untouched": {
                     "type": "string",
                     "index": "not_analyzed"
                  },
                  "full_name": {
                     "type": "string"
                  }
               }
            }
         }
      }
   }
}'

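Here we have two sub-fields. The one with the same name as the parent is the default, so if you search against the "full_name" field, Elasticsearch will actually use "full_name.full_name". "full_name.untouched" will give you the facet results you want.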

Next, I add two documents:

curl -XPUT "http://localhost:9200/test_index/people/1" -d'
{
   "full_name": "Jean-Paul Gautier"
}'

curl -XPUT "http://localhost:9200/test_index/people/2" -d'
{
   "full_name": "Jean De La Fontaine"
}'

Then I can facet on each field to see what is returned:

curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
   "size": 0,
   "facets": {
      "name_terms": {
         "terms": {
            "field": "full_name"
         }
      },
      "name_untouched": {
         "terms": {
            "field": "full_name.untouched",
            "size": 100
         }
      }
   }
}'

And I get back the following:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 0,
      "hits": []
   },
   "facets": {
      "name_terms": {
         "_type": "terms",
         "missing": 0,
         "total": 7,
         "other": 0,
         "terms": [
            {
               "term": "jean",
               "count": 2
            },
            {
               "term": "paul",
               "count": 1
            },
            {
               "term": "la",
               "count": 1
            },
            {
               "term": "gautier",
               "count": 1
            },
            {
               "term": "fontaine",
               "count": 1
            },
            {
               "term": "de",
               "count": 1
            }
         ]
      },
      "name_untouched": {
         "_type": "terms",
         "missing": 0,
         "total": 2,
         "other": 0,
         "terms": [
            {
               "term": "Jean-Paul Gautier",
               "count": 1
            },
            {
               "term": "Jean De La Fontaine",
               "count": 1
            }
         ]
      }
   }
}

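As you can see, the analyzed field returns single-word, lowercased tokens (when you do not specify an analyzer, the standard analyzer is used), while the un-analyzed sub-field returns the unmodified original text.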
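To tie this back to the original question (only the full names starting with "J"), one possible approach is to also run the query against the un-analyzed sub-field. The sketch below only builds a request body in Python; it is not part of the answer above. The helper name `build_search_body` is hypothetical, and the combination of a `prefix` query with the terms facet is an assumption about how the two pieces could be put together. Because the sub-field is not analyzed, the prefix is matched case-sensitively against the whole stored value.

```python
import json


def build_search_body(prefix, size=100):
    """Build a search body (hypothetical helper): a prefix query and a terms
    facet, both on the un-analyzed sub-field from the mapping above."""
    field = "full_name.untouched"
    return {
        "size": 0,  # no hits needed, only the facet
        "query": {"prefix": {field: prefix}},
        "facets": {
            "name_untouched": {"terms": {"field": field, "size": size}}
        },
    }


body = build_search_body("J")
print(json.dumps(body, indent=3))
```

The printed JSON could be sent as the body of the same `_search` request used above. Since facets are computed over the documents matched by the query, only values whose complete text begins with "J" would be counted under this assumption.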

Here is a runnable example you can play with: http://sense.qbox.io/gist/7abc063e2611846011dd874648fd1b77450b19a5

2020-06-22