Size of the data to fetch: roughly 20,000 documents.
Problem: I am searching Elasticsearch index data from Python with the command below, but I am not getting any results.
from pyelasticsearch import ElasticSearch

es_repo = ElasticSearch(settings.ES_INDEX_URL)
search_results = es_repo.search(
    query, index=advertiser_name, es_from=_from, size=_size)
It works fine if the size I pass is less than or equal to 10,000, but it does not work with 20,000. Please help me find the best solution for this.
PS: while digging deeper into ES I found this error message:
Result window is too large, from + size must be less than or equal to: [10000] but was [19999]. See the scroll api for a more efficient way to request large data sets.
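For context, that [10000] ceiling comes from the index setting index.max_result_window, which caps from + size pagination. As a minimal sketch (not part of the original question, and the index name is just a placeholder), the ceiling could be raised with the elasticsearch-py client, although larger windows cost more heap per search and the search_after / scroll approaches described below are generally the better answer:

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Hypothetical index name, used only for illustration.
es_index = "your_index_name"

# Raise the from + size ceiling for this index (default is 10000).
# Note: deep from + size pagination is memory-hungry; prefer search_after or scroll.
es.indices.put_settings(
    index=es_index,
    body={"index": {"max_result_window": 20000}})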
For real-time usage the best solution is to use the search after query. You only need a date field plus another field that uniquely identifies a document - an _id field or a _uid field is enough. Try something like this; in my example I want to extract all the documents that belong to a single user - in my example the user field has the keyword datatype:
from elasticsearch import Elasticsearch es = Elasticsearch() es_index = "your_index_name" documento = "your_doc_type" user = "Francesco Totti" body2 = { "query": { "term" : { "user" : user } } } res = es.count(index=es_index, doc_type=documento, body= body2) size = res['count'] body = { "size": 10, "query": { "term" : { "user" : user } }, "sort": [ {"date": "asc"}, {"_uid": "desc"} ] } result = es.search(index=es_index, doc_type=documento, body= body) bookmark = [result['hits']['hits'][-1]['sort'][0], str(result['hits']['hits'][-1]['sort'][1]) ] body1 = {"size": 10, "query": { "term" : { "user" : user } }, "search_after": bookmark, "sort": [ {"date": "asc"}, {"_uid": "desc"} ] } while len(result['hits']['hits']) < size: res =es.search(index=es_index, doc_type=documento, body= body1) for el in res['hits']['hits']: result['hits']['hits'].append( el ) bookmark = [res['hits']['hits'][-1]['sort'][0], str(result['hits']['hits'][-1]['sort'][1]) ] body1 = {"size": 10, "query": { "term" : { "user" : user } }, "search_after": bookmark, "sort": [ {"date": "asc"}, {"_uid": "desc"} ] }
Then you will find all the documents appended to the result var.
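If only the document bodies are needed rather than the full hit metadata, they can be pulled out of that structure with a list comprehension - a small usage sketch, where the docs variable name is just for illustration:

# Extract just the _source payload of every collected hit.
docs = [hit['_source'] for hit in result['hits']['hits']]
print(len(docs), "documents retrieved")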
Otherwise, if you would like to use a scroll query - doc here:
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()
es_index = "your_index_name"
documento = "your_doc_type"
user = "Francesco Totti"

body = {
    "query": {
        "term": {"user": user}
    }
}

# helpers.scan wraps the scroll API and yields every matching hit.
res = helpers.scan(
    client=es,
    scroll='2m',
    query=body,
    index=es_index)

for i in res:
    print(i)
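As a usage sketch (the all_docs name is illustrative and not from the original answer), the generator returned by helpers.scan can be materialised into a plain list of document sources in one pass:

# Collect every matching document's _source into a Python list.
all_docs = [hit['_source'] for hit in helpers.scan(
    client=es,
    scroll='2m',
    query=body,
    index=es_index)]
print(len(all_docs), "documents retrieved via scroll")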