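I created a new index with a mapping. It holds 500,000 documents.
I want to change the index's mapping, but that is not possible in Elasticsearch. So I created another index with the new mapping, and I am now trying to copy the documents from the old index to the new one.
I am using the scan and scroll search types to retrieve documents from the old index and copy them into the new one. The copy takes a long time and the system becomes slow while it runs.
Below is the code I am using.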
    $client = app('elastic_search');

    $params = [
        "search_type" => "scan",    // use search_type=scan
        "scroll"      => "30s",     // how long between scroll requests. should be small!
        "size"        => 500000,    // how many results *per shard* you want back
        "index"       => "admin_logs422",
        "body"        => [
            "query" => [
                "match_all" => new \stdClass()  // encodes as {} rather than [] in JSON
            ]
        ]
    ];

    $docs = $client->search($params);   // Execute the search
    $scroll_id = $docs['_scroll_id'];

    while (true) {
        // Execute a Scroll request
        $response = $client->scroll([
            "scroll_id" => $scroll_id,  // ...using our previously obtained _scroll_id
            "scroll"    => "500s"       // keep the scroll context alive while this batch is copied
        ]);

        if (count($response['hits']['hits']) > 0) {
            foreach ($response['hits']['hits'] as $s) {
                $params = [
                    'index'  => 'admin_logs421',
                    'type'   => 'admin_type421',
                    'id'     => $s['_id'],
                    'client' => [
                        'ignore'          => [400, 404],
                        'verbose'         => true,
                        'timeout'         => 10,
                        'connect_timeout' => 10
                    ],
                    'body'   => $s['_source']
                ];
                // Use a separate variable so the scroll response is not overwritten
                $createResponse = app('elastic_search')->create($params);
            }
            // Take the next scroll id from the scroll response, not from create()
            $scroll_id = $response['_scroll_id'];
        } else {
            // No results, scroll cursor is empty. You've exported all the data
            return response("completed");
        }
    }
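You don't have to write code like this yourself. There are some excellent tools around that can do it for you.
Just have a look at Taskrabbit's elasticdump utility, which does exactly what you need.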
    elasticdump \
      --input=http://localhost:9200/source_index \
      --output=http://localhost:9200/target_index \
      --type=data
Finally, if Python is an option for you, you can also use the reindex helper that ships with elasticsearch-py.
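For reference, here is a minimal sketch of what calling that helper might look like. It assumes the `elasticsearch` Python package is installed and that the cluster is reachable at the URL you pass in. The function name `copy_index` and its defaults are only illustrative, not part of the library.

```python
from typing import Tuple


def copy_index(es_url: str, source_index: str, target_index: str,
               chunk_size: int = 500, scroll: str = "5m") -> Tuple[int, list]:
    """Copy all documents from source_index into target_index.

    Returns the (success_count, errors) pair reported by the helper.
    The target index, with its new mapping, is assumed to exist already.
    """
    # Imported inside the function so that merely loading this file does not
    # require the client library to be installed.
    from elasticsearch import Elasticsearch, helpers

    client = Elasticsearch(es_url)

    # helpers.reindex scrolls through the source index and writes the hits to
    # the target index with bulk requests, chunk_size documents at a time.
    success, errors = helpers.reindex(
        client,
        source_index,
        target_index,
        chunk_size=chunk_size,
        scroll=scroll,
    )
    return success, errors
```

With the index names from the question, a call could look like `copy_index("http://localhost:9200", "admin_logs422", "admin_logs421")`. Because the helper sends documents in bulk rather than one request per document, a modest `chunk_size` is usually enough.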