- [[docs-update-by-query]]
- == Update By Query API
- experimental[The update-by-query API is new and should still be considered experimental. The API may change in ways that are not backwards compatible]
- The simplest usage of `updateByQuery` updates each
- document in an index without changing the source. This usage enables
- <<picking-up-a-new-property,picking up a new property>> or another online
- mapping change.
- [source,java]
- --------------------------------------------------
- UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
- updateByQuery.source("source_index").abortOnVersionConflict(false);
- BulkIndexByScrollResponse response = updateByQuery.get();
- --------------------------------------------------
- A call to the `updateByQuery` API starts by taking a snapshot of the index and then
- indexes each document it finds using `internal` versioning.
- NOTE: Version conflicts happen when a document changes between the time of the
- snapshot and the time the index request processes.
- When the versions match, `updateByQuery` updates the document
- and increments the version number.
- All update and query failures cause `updateByQuery` to abort. These failures are
- available from the `BulkIndexByScrollResponse#getIndexingFailures` method. Any
- successful updates remain and are not rolled back. While the first failure
- causes the abort, the response contains all of the failures generated by the
- failed bulk request.
- To prevent version conflicts from causing `updateByQuery` to abort, set
- `abortOnVersionConflict(false)`. The first example does this because it only
- aims to pick up an online mapping change. A version conflict there means that
- the conflicting document was updated between the start of the `updateByQuery`
- and the time when it attempted to update the document. This is fine because
- that conflicting update will already have picked up the online mapping update.
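The version check and the effect of `abortOnVersionConflict` can be pictured with a small stand-alone model. This is only a sketch and not Elasticsearch code: the class `VersionConflictSketch`, its `Result` type, and the two maps are hypothetical names invented for illustration, and the real request works in bulk batches rather than one document at a time.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the snapshot/version check described above (assumption: not the real implementation).
class VersionConflictSketch {

    /** Hypothetical outcome counters for one run. */
    static final class Result {
        int updated;
        int versionConflicts;
        boolean aborted;
    }

    /**
     * snapshot: document id -> version seen when the snapshot was taken.
     * live:     document id -> version currently in the index (mutated in place).
     */
    static Result run(Map<String, Long> snapshot, Map<String, Long> live,
                      boolean abortOnVersionConflict) {
        Result result = new Result();
        for (Map.Entry<String, Long> entry : snapshot.entrySet()) {
            long seen = entry.getValue();
            Long current = live.get(entry.getKey());
            if (current != null && current.longValue() == seen) {
                // Versions match: apply the update and increment the version.
                live.put(entry.getKey(), seen + 1);
                result.updated++;
            } else {
                // The document changed (or vanished) after the snapshot.
                result.versionConflicts++;
                if (abortOnVersionConflict) {
                    // Stop here; updates already applied are not rolled back.
                    result.aborted = true;
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> snapshot = new LinkedHashMap<>();
        snapshot.put("a", 1L);
        snapshot.put("b", 1L);
        snapshot.put("c", 1L);
        Map<String, Long> live = new HashMap<>(snapshot);
        live.put("b", 2L); // "b" was modified after the snapshot was taken

        Result r = run(snapshot, live, false);
        System.out.println(r.updated + " updated, " + r.versionConflicts
                + " conflicts, aborted=" + r.aborted);
    }
}
```

With the flag set to `false`, the model skips the conflicting document and carries on. With `true`, it stops at the first conflict and leaves the earlier updates in place.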
- The `UpdateByQueryRequestBuilder` API supports filtering the updated documents,
- limiting the total number of documents to update, and updating documents
- with a script:
- [source,java]
- --------------------------------------------------
- UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
- updateByQuery.source("source_index")
- .filter(termQuery("level", "awesome"))
- .size(1000)
- .script(new Script("ctx._source.awesome = 'absolutely'", ScriptType.INLINE, "painless", emptyMap()));
- BulkIndexByScrollResponse response = updateByQuery.get();
- --------------------------------------------------
- `UpdateByQueryRequestBuilder` also enables direct access to the query used
- to select the documents. You can use this access to change the default scroll size or
- otherwise modify the request for matching documents.
- [source,java]
- --------------------------------------------------
- UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
- updateByQuery.source("source_index")
- .source().setSize(500);
- BulkIndexByScrollResponse response = updateByQuery.get();
- --------------------------------------------------
- You can also combine `size` with sorting to limit the documents updated:
- [source,java]
- --------------------------------------------------
- UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
- updateByQuery.source("source_index").size(100)
- .source().addSort("cat", SortOrder.DESC);
- BulkIndexByScrollResponse response = updateByQuery.get();
- --------------------------------------------------
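To make the effect of combining `size` with a sort concrete, the following plain-Java sketch picks out which documents would be touched: the first `size` entries in descending sort order. It assumes each document is represented only by its `cat` value; `SortedLimitSketch` and `selectForUpdate` are hypothetical names, and the real selection happens inside the scroll query, not in client code.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: models "sort descending, then keep the first `size` documents".
class SortedLimitSketch {

    /** Returns the `cat` values of the documents that would be updated. */
    static List<String> selectForUpdate(List<String> catValues, int size) {
        return catValues.stream()
                .sorted(Comparator.reverseOrder()) // SortOrder.DESC in the example above
                .limit(size)                       // size(...) caps the number of documents
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> cats = Arrays.asList("b", "d", "a", "c");
        System.out.println(selectForUpdate(cats, 2));
    }
}
```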
- In addition to changing the `_source` field for the document, you can use a
- script to change the action, similar to the Update API:
- [source,java]
- --------------------------------------------------
- UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
- updateByQuery.source("source_index")
- .script(new Script(
- "if (ctx._source.awesome == 'absolutely') {"
- + " ctx.op='noop'"
- + "} else if (ctx._source.awesome == 'lame') {"
- + " ctx.op='delete'"
- + "} else {"
- + "ctx._source.awesome = 'absolutely'}", ScriptType.INLINE, "painless", emptyMap()));
- BulkIndexByScrollResponse response = updateByQuery.get();
- --------------------------------------------------
- As in the <<docs-update,Update API>>, you can set the value of `ctx.op` to change the
- operation that executes:
- `noop`::
- Set `ctx.op = "noop"` if your script doesn't make any
- changes. The `updateByQuery` operation then omits that document from the updates.
- This behavior increments the `noop` counter in the
- <<docs-update-by-query-response-body, response body>>.
- `delete`::
- Set `ctx.op = "delete"` if your script decides that the document must be
- deleted. The deletion will be reported in the `deleted` counter in the
- <<docs-update-by-query-response-body, response body>>.
- Setting `ctx.op` to any other value generates an error. Setting any
- other field in `ctx` generates an error.
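The rules for `ctx.op` can be summarized as a small dispatch. The sketch below is a stand-alone model, not the server-side logic: `CtxOpSketch`, `Outcome`, and `tally` are hypothetical names, and the default value `"index"` for an untouched `ctx.op` is an assumption that this page does not state.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Toy model of how the value a script leaves in ctx.op maps to an outcome.
class CtxOpSketch {

    // Assumption: the value ctx.op holds when the script leaves it untouched.
    static final String DEFAULT_OP = "index";

    enum Outcome { UPDATED, NOOP, DELETED }

    /** Maps a ctx.op value to an outcome; any unknown value is an error. */
    static Outcome outcomeFor(String op) {
        if (DEFAULT_OP.equals(op)) {
            return Outcome.UPDATED;
        } else if ("noop".equals(op)) {
            return Outcome.NOOP;    // document is left out of the updates
        } else if ("delete".equals(op)) {
            return Outcome.DELETED; // document is removed
        }
        throw new IllegalArgumentException("Unsupported ctx.op value: " + op);
    }

    /** Counts outcomes, loosely mirroring the noop and deleted counters. */
    static Map<Outcome, Integer> tally(List<String> ops) {
        Map<Outcome, Integer> counts = new EnumMap<>(Outcome.class);
        for (Outcome outcome : Outcome.values()) {
            counts.put(outcome, 0);
        }
        for (String op : ops) {
            counts.merge(outcomeFor(op), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(tally(Arrays.asList("noop", "delete", DEFAULT_OP, "noop")));
    }
}
```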
- This API doesn't allow you to move the documents it touches, just modify their
- source. This is intentional! We've made no provisions for removing the document
- from its original location.
- You can also perform these operations on multiple indices and types at once, similar to the search API:
- [source,java]
- --------------------------------------------------
- UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
- updateByQuery.source("foo", "bar").source().setTypes("a", "b");
- BulkIndexByScrollResponse response = updateByQuery.get();
- --------------------------------------------------
- If you provide a `routing` value then the process copies the routing value to the scroll query,
- limiting the process to the shards that match that routing value:
- [source,java]
- --------------------------------------------------
- UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
- updateByQuery.source().setRouting("cat");
- BulkIndexByScrollResponse response = updateByQuery.get();
- --------------------------------------------------
- `updateByQuery` can also use the <<ingest>> feature by
- specifying a `pipeline` like this:
- [source,java]
- --------------------------------------------------
- UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
- updateByQuery.setPipeline("hurray");
- BulkIndexByScrollResponse response = updateByQuery.get();
- --------------------------------------------------
- [float]
- [[docs-update-by-query-task-api]]
- === Works with the Task API
- You can fetch the status of all running update-by-query requests with the
- <<tasks,Task API>>:
- [source,java]
- --------------------------------------------------
- ListTasksResponse tasksList = client.admin().cluster().prepareListTasks()
- .setActions(UpdateByQueryAction.NAME).setDetailed(true).get();
- for (TaskInfo info: tasksList.getTasks()) {
- TaskId taskId = info.getTaskId();
- BulkByScrollTask.Status status = (BulkByScrollTask.Status) info.getStatus();
- // do stuff
- }
- --------------------------------------------------
- With the `TaskId` shown above you can look up the task directly:
- // provide API Example
- [source,java]
- --------------------------------------------------
- GetTaskResponse get = client.admin().cluster().prepareGetTask(taskId).get();
- --------------------------------------------------
- [float]
- [[docs-update-by-query-cancel-task-api]]
- === Works with the Cancel Task API
- Any update-by-query request can be canceled using the <<tasks,Task Cancel API>>:
- [source,java]
- --------------------------------------------------
- // Cancel all update-by-query requests
- client.admin().cluster().prepareCancelTasks().setActions(UpdateByQueryAction.NAME).get().getTasks();
- // Cancel a specific update-by-query request
- client.admin().cluster().prepareCancelTasks().setTaskId(taskId).get().getTasks();
- --------------------------------------------------
- Use the `list tasks` API to find the value of `taskId`.
- Cancellation is usually fast but can take up to a few seconds.
- The task status API continues to list the task until the cancellation is complete.
- [float]
- [[docs-update-by-query-rethrottle]]
- === Rethrottling
- Use the `_rethrottle` API to change the value of `requests_per_second` on a running update:
- [source,java]
- --------------------------------------------------
- RethrottleAction.INSTANCE.newRequestBuilder(client).setTaskId(taskId).setRequestsPerSecond(2.0f).get();
- --------------------------------------------------
- Use the `list tasks` API to find the value of `taskId`.
- As with the `updateByQuery` API, the value of `requests_per_second`
- can be any positive float value to set the level of the throttle, or `Float.POSITIVE_INFINITY` to disable throttling.
- A `requests_per_second` value that speeds up the process takes
- effect immediately. A value that slows the process down takes effect
- after the current batch completes, in order to prevent scroll timeouts.
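As a rough illustration of what a `requests_per_second` value means in practice, the sketch below computes how long the process would pause after a batch. It rests on an assumption this page does not spell out: that throttling works by waiting between batches, with the wait equal to the batch size divided by `requests_per_second`, minus the time already spent writing. `ThrottleSketch` and `waitSeconds` are hypothetical names, not part of the client API.

```java
// Toy model of throttling between batches (assumed formula, see the lead-in).
class ThrottleSketch {

    /**
     * @param batchSize         number of documents in the batch just written
     * @param requestsPerSecond positive throttle value, or Float.POSITIVE_INFINITY for none
     * @param writeSeconds      time the batch took to write
     * @return seconds to wait before requesting the next batch
     */
    static double waitSeconds(int batchSize, float requestsPerSecond, double writeSeconds) {
        if (Float.isNaN(requestsPerSecond) || requestsPerSecond <= 0) {
            throw new IllegalArgumentException("requests_per_second must be positive");
        }
        if (Float.isInfinite(requestsPerSecond)) {
            return 0.0; // throttling disabled
        }
        double targetSeconds = batchSize / (double) requestsPerSecond;
        // Never wait a negative amount if writing already took longer than the target.
        return Math.max(0.0, targetSeconds - writeSeconds);
    }

    public static void main(String[] args) {
        System.out.println(waitSeconds(1000, 500f, 0.5));
        System.out.println(waitSeconds(1000, Float.POSITIVE_INFINITY, 0.5));
    }
}
```

Under this model, lowering `requests_per_second` lengthens the pause between batches, which fits with a slowdown being applied only at a batch boundary.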