|
@@ -0,0 +1,358 @@
|
|
|
+[role="xpack"]
|
|
|
+[[data-stream-reindex-api]]
|
|
|
+=== Reindex data stream API
|
|
|
+++++
|
|
|
+<titleabbrev>Reindex data stream</titleabbrev>
|
|
|
+++++
|
|
|
+
|
|
|
+.New API reference
|
|
|
+[sidebar]
|
|
|
+--
|
|
|
+For the most up-to-date API details, refer to {api-es}/group/endpoint-migration[Migration APIs].
|
|
|
+--
|
|
|
+
|
|
|
+include::{es-ref-dir}/migration/apis/shared-migration-apis-tip.asciidoc[]
|
|
|
+
|
|
|
+The reindex data stream API is used to upgrade the backing indices of a data stream to the most
|
|
|
+recent major version. It works by reindexing each backing index into a new index, then replacing the original
|
|
|
+backing index with its replacement and deleting the original backing index. The settings and mappings
|
|
|
+from the original backing indices are copied to the resulting backing indices.
|
|
|
+
|
|
|
+This api runs in the background because reindexing all indices in a large data stream
|
|
|
+is expected to take a large amount of time and resources. The endpoint will return immediately and a persistent
|
|
|
+task will be created to run in the background. The current status of the task can be checked with
|
|
|
+the <<data-stream-reindex-status-api,reindex status API>>. This status will be available for 24 hours after the task completes, whether
|
|
|
+it finished successfully or failed. If the status is still available for a task, the task must be cancelled before it can be re-run.
|
|
|
+A running or recently completed data stream reindex task can be cancelled using the <<data-stream-reindex-cancel-api,reindex cancel API>>.
|
|
|
+
|
|
|
+///////////////////////////////////////////////////////////
|
|
|
+[source,console]
|
|
|
+------------------------------------------------------
|
|
|
+POST /_migration/reindex/my-data-stream/_cancel
|
|
|
+DELETE _data_stream/my-data-stream
|
|
|
+DELETE _index_template/my-data-stream-template
|
|
|
+------------------------------------------------------
|
|
|
+// TEARDOWN
|
|
|
+///////////////////////////////////////////////////////////
|
|
|
+
|
|
|
+
|
|
|
+[[data-stream-reindex-api-request]]
|
|
|
+==== {api-request-title}
|
|
|
+
|
|
|
+`POST /_migration/reindex`
|
|
|
+
|
|
|
+
|
|
|
+[[data-stream-reindex-api-prereqs]]
|
|
|
+==== {api-prereq-title}
|
|
|
+
|
|
|
+* If the {es} {security-features} are enabled, you must have the `manage`
|
|
|
+<<privileges-list-indices,index privilege>> for the data stream.
|
|
|
+
|
|
|
+[[data-stream-reindex-body]]
|
|
|
+==== {api-request-body-title}
|
|
|
+
|
|
|
+`source`::
|
|
|
+`index`:::
|
|
|
+(Required, string) The name of the data stream to upgrade.
|
|
|
+
|
|
|
+`mode`::
|
|
|
+(Required, enum) Set to `upgrade` to upgrade the data stream in-place, using the same source and destination
|
|
|
+data stream. Each out-of-date backing index will be reindexed. Then the new backing index is swapped into the data stream and the old index is deleted.
|
|
|
+Currently, the only allowed value for this parameter is `upgrade`.
|
|
|
+
|
|
|
+[[reindex-data-stream-api-settings]]
|
|
|
+==== Settings
|
|
|
+
|
|
|
+You can use the following settings to control the behavior of the reindex data stream API:
|
|
|
+
|
|
|
+[[migrate_max_concurrent_indices_reindexed_per_data_stream-setting]]
|
|
|
+// tag::migrate_max_concurrent_indices_reindexed_per_data_stream-setting-tag[]
|
|
|
+`migrate.max_concurrent_indices_reindexed_per_data_stream`
|
|
|
+(<<dynamic-cluster-setting,Dynamic>>)
|
|
|
+The number of backing indices within a given data stream which will be reindexed concurrently. Defaults to `1`.
|
|
|
+// end::migrate_max_concurrent_indices_reindexed_per_data_stream-tag[]
|
|
|
+
|
|
|
+[[migrate_data_stream_reindex_max_request_per_second-setting]]
|
|
|
+// tag::migrate_data_stream_reindex_max_request_per_second-setting-tag[]
|
|
|
+`migrate.data_stream_reindex_max_request_per_second`
|
|
|
+(<<dynamic-cluster-setting,Dynamic>>)
|
|
|
+The average maximum number of documents within a given backing index to reindex per second.
|
|
|
+Defaults to `1000`, though can be any decimal number greater than `0`.
|
|
|
+To remove throttling, set to `-1`.
|
|
|
+This setting can be used to throttle the reindex process and manage resource usage.
|
|
|
+Consult the <<docs-reindex-throttle,reindex throttle docs>> for more information.
|
|
|
+// end::migrate_data_stream_reindex_max_request_per_second-tag[]
|
|
|
+
|
|
|
+
|
|
|
+[[reindex-data-stream-api-example]]
|
|
|
+==== {api-examples-title}
|
|
|
+
|
|
|
+Assume we have a data stream `my-data-stream` with the following backing indices, all of which have index major version 7.x.
|
|
|
+
|
|
|
+* .ds-my-data-stream-2025.01.23-000001
|
|
|
+* .ds-my-data-stream-2025.01.23-000002
|
|
|
+* .ds-my-data-stream-2025.01.23-000003
|
|
|
+
|
|
|
+Let's also assume that `.ds-my-data-stream-2025.01.23-000003` is the write index.
|
|
|
+If {es} is version 8.x and we wish to upgrade to major version 9.x, the version 7.x indices must be upgraded in preparation.
|
|
|
+We can use this API to reindex a data stream with version 7.x backing indices and make them version 8 backing indices.
|
|
|
+
|
|
|
+Start by calling the API:
|
|
|
+
|
|
|
+[[reindex-data-stream-start]]
|
|
|
+[source,console]
|
|
|
+----
|
|
|
+POST _migration/reindex
|
|
|
+{
|
|
|
+ "source": {
|
|
|
+ "index": "my-data-stream"
|
|
|
+ },
|
|
|
+ "mode": "upgrade"
|
|
|
+}
|
|
|
+----
|
|
|
+// TEST[setup:my_data_stream]
|
|
|
+
|
|
|
+
|
|
|
+As this task runs in the background this API will return immediately.
|
|
|
+The task will do the following.
|
|
|
+
|
|
|
+First, the data stream is rolled over. So that no documents are lost during the reindex, we add <<index-block-settings,write blocks>>
|
|
|
+ to the existing backing indices before reindexing them. Since a data stream's write index cannot have a write block,
|
|
|
+ the data stream is must be rolled over. This will produce a new write index, `.ds-my-data-stream-2025.01.23-000004`; which
|
|
|
+ has an 8.x version and thus does not need to be upgraded.
|
|
|
+
|
|
|
+Once the data stream has a write index with an 8.x version we can proceed with reindexing the old indices.
|
|
|
+For each of the version 7.x indices, we now do the following:
|
|
|
+
|
|
|
+* Add a write block to the source index to guarantee that no writes are lost.
|
|
|
+* Open the source index if it is closed.
|
|
|
+* Delete the destination index if one exists. This is done in case we are retrying after a failure, so that we start with a fresh index.
|
|
|
+* Create the destination index using the <<indices-create-index-from-source, create from source API>>.
|
|
|
+This copies the settings and mappings from the old backing index to the new backing index.
|
|
|
+* Use the <<docs-reindex, reindex API>> to copy the contents of the old backing index to the new backing index.
|
|
|
+* Close the destination index if the source index was originally closed.
|
|
|
+* Replace the old index in the data stream with the new index, using the <<modify-data-streams-api,modify data streams API>>.
|
|
|
+* Finally, the old backing index is deleted.
|
|
|
+
|
|
|
+By default only one backing index will be processed at a time.
|
|
|
+This can be modified using the <<migrate_max_concurrent_indices_reindexed_per_data_stream-setting,`migrate_max_concurrent_indices_reindexed_per_data_stream-setting` setting>>.
|
|
|
+
|
|
|
+While the reindex data stream task is running, we can inspect the current status using the <<data-stream-reindex-status-api,reindex status API>>:
|
|
|
+[source,console]
|
|
|
+----
|
|
|
+GET /_migration/reindex/my-data-stream/_status
|
|
|
+----
|
|
|
+// TEST[continued]
|
|
|
+
|
|
|
+For the above example, the following would be a possible status:
|
|
|
+
|
|
|
+[source,console-result]
|
|
|
+----
|
|
|
+{
|
|
|
+ "start_time_millis": 1737676174349,
|
|
|
+ "complete": false,
|
|
|
+ "total_indices_in_data_stream": 4,
|
|
|
+ "total_indices_requiring_upgrade": 3,
|
|
|
+ "successes": 0,
|
|
|
+ "in_progress": [
|
|
|
+ {
|
|
|
+ "index": ".ds-my-data-stream-2025.01.23-000001",
|
|
|
+ "total_doc_count": 10000000,
|
|
|
+ "reindexed_doc_count": 999999
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "pending": 2,
|
|
|
+ "errors": []
|
|
|
+}
|
|
|
+----
|
|
|
+// TEST[skip:specific value is part of explanation]
|
|
|
+
|
|
|
+This output means that the first backing index, `.ds-my-data-stream-2025.01.23-000001`, is currently being processed,
|
|
|
+and none of the backing indices have yet completed. Notice that `total_indices_in_data_stream` has a value of `4`,
|
|
|
+because after the rollover, there are 4 indices in the data stream. But the new write index has an 8.x version, and
|
|
|
+thus doesn't need to be reindexed, so `total_indices_requiring_upgrade` is only 3.
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+[[reindex-data-stream-cancel-restart]]
|
|
|
+===== Cancelling and Restarting
|
|
|
+The <<reindex-data-stream-api-settings, reindex datastream settings>> provide a few ways to control the performance and
|
|
|
+resource usage of a reindex task. This example shows how we can stop a running reindex task, modify the settings,
|
|
|
+and restart the task.
|
|
|
+
|
|
|
+Continuing with the above example, assume the reindexing task has not yet completed, and the <<data-stream-reindex-status-api,reindex status API>>
|
|
|
+returns the following:
|
|
|
+
|
|
|
+[source,console-result]
|
|
|
+----
|
|
|
+{
|
|
|
+ "start_time_millis": 1737676174349,
|
|
|
+ "complete": false,
|
|
|
+ "total_indices_in_data_stream": 4,
|
|
|
+ "total_indices_requiring_upgrade": 3,
|
|
|
+ "successes": 1,
|
|
|
+ "in_progress": [
|
|
|
+ {
|
|
|
+ "index": ".ds-my-data-stream-2025.01.23-000002",
|
|
|
+ "total_doc_count": 10000000,
|
|
|
+ "reindexed_doc_count": 1000
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "pending": 1,
|
|
|
+ "errors": []
|
|
|
+}
|
|
|
+----
|
|
|
+// TEST[skip:specific value is part of explanation]
|
|
|
+
|
|
|
+Let's assume the task has been running for a long time. By default, we throttle how many requests the reindex operation
|
|
|
+can execute per second. This keeps the reindex process from consuming too many resources.
|
|
|
+But the default value of `1000` request per second will not be correct for all use cases.
|
|
|
+The <<migrate_data_stream_reindex_max_request_per_second-setting,`migrate.data_stream_reindex_max_request_per_second` setting>>
|
|
|
+can be used to increase or decrease the number of requests per second, or to remove the throttle entirely.
|
|
|
+
|
|
|
+Changing this setting won't have an effect on the backing index that is currently being reindexed.
|
|
|
+For example, changing the setting won't have an effect on `.ds-my-data-stream-2025.01.23-000002`, but would have an
|
|
|
+effect on the next backing index.
|
|
|
+
|
|
|
+But in the above status, `.ds-my-data-stream-2025.01.23-000002` has values of 1000 and 10M for the
|
|
|
+`reindexed_doc_count` and `total_doc_count`, respectively. This means it has only reindexed 0.01% of the documents in the index.
|
|
|
+It might be a good time to cancel the run and optimize some settings without losing much work.
|
|
|
+So we call the <<data-stream-reindex-cancel-api,cancel API>>:
|
|
|
+
|
|
|
+[source,console]
|
|
|
+----
|
|
|
+POST /_migration/reindex/my-data-stream/_cancel
|
|
|
+----
|
|
|
+// TEST[skip:task will not be present]
|
|
|
+
|
|
|
+Now we can use the <<cluster-update-settings, update cluster settings API>> to increase the throttle:
|
|
|
+
|
|
|
+[source,console]
|
|
|
+--------------------------------------------------
|
|
|
+PUT /_cluster/settings
|
|
|
+{
|
|
|
+ "persistent" : {
|
|
|
+ "migrate.data_stream_reindex_max_request_per_second" : 10000
|
|
|
+ }
|
|
|
+}
|
|
|
+--------------------------------------------------
|
|
|
+// TEST[continued]
|
|
|
+
|
|
|
+The <<reindex-data-stream-start,original reindex command>> can now be used to restart reindexing.
|
|
|
+Because the first backing index, `.ds-my-data-stream-2025.01.23-000001`, has already been reindexed and thus is already version 8.x,
|
|
|
+it will be skipped. The task will start by reindexing `.ds-my-data-stream-2025.01.23-000002` again from the beginning.
|
|
|
+
|
|
|
+Later, once all the backing indices have finished, the <<data-stream-reindex-status-api,reindex status API>> will return something like the following:
|
|
|
+
|
|
|
+[source,console-result]
|
|
|
+----
|
|
|
+{
|
|
|
+ "start_time_millis": 1737676174349,
|
|
|
+ "complete": true,
|
|
|
+ "total_indices_in_data_stream": 4,
|
|
|
+ "total_indices_requiring_upgrade": 2,
|
|
|
+ "successes": 2,
|
|
|
+ "in_progress": [],
|
|
|
+ "pending": 0,
|
|
|
+ "errors": []
|
|
|
+}
|
|
|
+----
|
|
|
+// TEST[skip:specific value is part of explanation]
|
|
|
+
|
|
|
+Notice that the value of `total_indices_requiring_upgrade` is `2`, unlike the previous status, which had a value of `3`.
|
|
|
+This is because `.ds-my-data-stream-2025.01.23-000001` was upgraded before the task cancellation.
|
|
|
+After the restart, the API sees that it does not need to be upgraded, thus does not include it in `total_indices_requiring_upgrade` or `successes`,
|
|
|
+despite the fact that it upgraded successfully.
|
|
|
+
|
|
|
+The completed status will be accessible from the status API for 24 hours after completion of the task.
|
|
|
+
|
|
|
+We can now check the data stream to verify that indices were upgraded:
|
|
|
+
|
|
|
+[source,console]
|
|
|
+----
|
|
|
+GET _data_stream/my-data-stream?filter_path=data_streams.indices.index_name
|
|
|
+----
|
|
|
+// TEST[continued]
|
|
|
+
|
|
|
+
|
|
|
+which returns:
|
|
|
+[source,console-result]
|
|
|
+----
|
|
|
+{
|
|
|
+ "data_streams": [
|
|
|
+ {
|
|
|
+ "indices": [
|
|
|
+ {
|
|
|
+ "index_name": ".migrated-ds-my-data-stream-2025.01.23-000003"
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "index_name": ".migrated-ds-my-data-stream-2025.01.23-000002"
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "index_name": ".migrated-ds-my-data-stream-2025.01.23-000001"
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "index_name": ".ds-my-data-stream-2025.01.23-000004"
|
|
|
+ }
|
|
|
+ ]
|
|
|
+ }
|
|
|
+ ]
|
|
|
+}
|
|
|
+----
|
|
|
+// TEST[skip:did not actually run reindex]
|
|
|
+
|
|
|
+Index `.ds-my-data-stream-2025.01.23-000004` is the write index and didn't need to be upgraded because it was created with version 8.x.
|
|
|
+The other three backing indices are now prefixed with `.migrated` because they have been upgraded.
|
|
|
+
|
|
|
+We can now check the indices and verify that they have version 8.x:
|
|
|
+[source,console]
|
|
|
+----
|
|
|
+GET .migrated-ds-my-data-stream-2025.01.23-000001?human&filter_path=*.settings.index.version.created_string
|
|
|
+----
|
|
|
+// TEST[skip:migrated index does not exist]
|
|
|
+
|
|
|
+which returns:
|
|
|
+[source,console-result]
|
|
|
+----
|
|
|
+{
|
|
|
+ ".migrated-ds-my-data-stream-2025.01.23-000001": {
|
|
|
+ "settings": {
|
|
|
+ "index": {
|
|
|
+ "version": {
|
|
|
+ "created_string": "8.18.0"
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+}
|
|
|
+----
|
|
|
+// TEST[skip:migrated index does not exist]
|
|
|
+
|
|
|
+[[reindex-data-stream-handling-failure]]
|
|
|
+===== Handling Failures
|
|
|
+Since the reindex data stream API runs in the background, failure information can be obtained through the <<data-stream-reindex-status-api,reindex status API>>.
|
|
|
+For example, if the backing index `.ds-my-data-stream-2025.01.23-000002` was accidentally deleted by a user, we would see a status like the following:
|
|
|
+
|
|
|
+[source,console-result]
|
|
|
+----
|
|
|
+{
|
|
|
+ "start_time_millis": 1737676174349,
|
|
|
+ "complete": false,
|
|
|
+ "total_indices_in_data_stream": 4,
|
|
|
+ "total_indices_requiring_upgrade": 3,
|
|
|
+ "successes": 1,
|
|
|
+ "in_progress": [],
|
|
|
+ "pending": 1,
|
|
|
+ "errors": [
|
|
|
+ {
|
|
|
+ "index": ".ds-my-data-stream-2025.01.23-000002",
|
|
|
+ "message": "index [.ds-my-data-stream-2025.01.23-000002] does not exist"
|
|
|
+ }
|
|
|
+ ]
|
|
|
+}
|
|
|
+----
|
|
|
+// TEST[skip:result just part of explanation]
|
|
|
+
|
|
|
+Once the issue has been fixed, the failed reindex task can be re-run. First, the failed run's status must be cleared
|
|
|
+using the <<data-stream-reindex-cancel-api,reindex cancel API>>. Then the
|
|
|
+<<reindex-data-stream-start,original reindex command>> can be called to pick up where it left off.
|