I have some documents:
{"name": "John", "district": 1}, {"name": "Mary", "district": 2}, {"name": "Nick", "district": 1}, {"name": "Bob", "district": 3}, {"name": "Kenny", "district": 1}
How can I filter/select documents so that I get one per distinct district?
{"name": "John", "district": 1}, {"name": "Mary", "district": 2}, {"name": "Bob", "district": 3}
In SQL I could use GROUP BY. I tried a terms aggregation, but it only returns counts for the distinct values.
"aggs": { "distinct": { "terms": { "field": "district", "size": 0 } } }
Thanks for your help! :-)
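If your Elasticsearch version is 1.3 or later, you can use a sub-aggregation of type top_hits. By default it returns the top three matching documents in each bucket, sorted by query score (here every score is 1, because the search is a match_all query).
To get more than three documents per bucket, raise the size parameter of top_hits.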
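Since the question asks for a single document per district, the same idea can be applied with the top_hits size set to 1. The following is a minimal sketch that only builds the request body as a Python dict; the function name `build_distinct_query` and the `max_buckets` parameter are hypothetical, while the aggregation names `by_district` and `tops` are taken from the query in this answer. It uses an explicit bucket limit instead of `"size": 0` on the terms aggregation, on the assumption that not every Elasticsearch version accepts 0 there; check the documentation for your version.

```python
import json


def build_distinct_query(field, docs_per_bucket=1, max_buckets=100):
    """Build a search body that groups by `field` and keeps a few hits per group.

    Hypothetical helper: it only constructs the JSON body and sends nothing.
    """
    return {
        # Suppress the top-level hit list; only the aggregation is of interest.
        "size": 0,
        "aggs": {
            "by_district": {
                # One bucket per distinct value of `field`.
                "terms": {"field": field, "size": max_buckets},
                "aggs": {
                    # Keep `docs_per_bucket` documents inside each bucket.
                    "tops": {"top_hits": {"size": docs_per_bucket}}
                },
            }
        },
    }


body = build_distinct_query("district")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request in the same way as the query in this answer.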
The following dataset and query:
POST /test/districts/
{"name": "John", "district": 1}

POST /test/districts/
{"name": "Mary", "district": 2}

POST /test/districts/
{"name": "Nick", "district": 1}

POST /test/districts/
{"name": "Bob", "district": 3}

POST test/districts/_search
{
  "size": 0,
  "aggs": {
    "by_district": {
      "terms": {
        "field": "district",
        "size": 0
      },
      "aggs": {
        "tops": {
          "top_hits": {
            "size": 10
          }
        }
      }
    }
  }
}
will return the documents grouped by district, which is what you want:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "by_district": {
      "buckets": [
        {
          "key": 1,
          "key_as_string": "1",
          "doc_count": 2,
          "tops": {
            "hits": {
              "total": 2,
              "max_score": 1,
              "hits": [
                {
                  "_index": "test",
                  "_type": "districts",
                  "_id": "XYHu4I-JQcOfLm3iWjTiOg",
                  "_score": 1,
                  "_source": { "name": "John", "district": 1 }
                },
                {
                  "_index": "test",
                  "_type": "districts",
                  "_id": "5dul2XMTRC2IpV_tKRRltA",
                  "_score": 1,
                  "_source": { "name": "Nick", "district": 1 }
                }
              ]
            }
          }
        },
        {
          "key": 2,
          "key_as_string": "2",
          "doc_count": 1,
          "tops": {
            "hits": {
              "total": 1,
              "max_score": 1,
              "hits": [
                {
                  "_index": "test",
                  "_type": "districts",
                  "_id": "I-9Gd4OYSRuexhP1dCdQ-g",
                  "_score": 1,
                  "_source": { "name": "Mary", "district": 2 }
                }
              ]
            }
          }
        },
        {
          "key": 3,
          "key_as_string": "3",
          "doc_count": 1,
          "tops": {
            "hits": {
              "total": 1,
              "max_score": 1,
              "hits": [
                {
                  "_index": "test",
                  "_type": "districts",
                  "_id": "bti2y-OUT3q2mBNhhI3xeA",
                  "_score": 1,
                  "_source": { "name": "Bob", "district": 3 }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
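To reduce a response of this shape to one document per district, as in the question, the first hit of each bucket can be picked on the client side. The sketch below is an illustration, not part of any Elasticsearch client library: the function name `first_doc_per_bucket` is hypothetical, and `sample_response` is a trimmed copy of the response above that keeps only the fields the function reads.

```python
def first_doc_per_bucket(response, agg_name="by_district", hits_name="tops"):
    """Return the _source of the first top_hits entry in every terms bucket."""
    docs = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        if hits:  # skip a bucket that unexpectedly has no hits
            docs.append(hits[0]["_source"])
    return docs


# Trimmed copy of the response shown above (only the fields used here).
sample_response = {
    "aggregations": {
        "by_district": {
            "buckets": [
                {"key": 1, "doc_count": 2, "tops": {"hits": {"hits": [
                    {"_source": {"name": "John", "district": 1}},
                    {"_source": {"name": "Nick", "district": 1}},
                ]}}},
                {"key": 2, "doc_count": 1, "tops": {"hits": {"hits": [
                    {"_source": {"name": "Mary", "district": 2}},
                ]}}},
                {"key": 3, "doc_count": 1, "tops": {"hits": {"hits": [
                    {"_source": {"name": "Bob", "district": 3}},
                ]}}},
            ]
        }
    }
}

distinct_docs = first_doc_per_bucket(sample_response)
for doc in distinct_docs:
    print(doc)
```

For the sample data this yields John, Mary and Bob, one document for each of districts 1, 2 and 3. Which document represents a district depends on the order of the hits inside its bucket; here all scores are equal, so that order should not be relied on.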