5 years ago · d5e6b13151
--- a/docs/reference/data-streams/use-a-data-stream.asciidoc
+++ b/docs/reference/data-streams/use-a-data-stream.asciidoc
@@ -7,6 +7,7 @@ the following:
 
				 * <<add-documents-to-a-data-stream>>
			
 
				 * <<search-a-data-stream>>
			
 
				 * <<manually-roll-over-a-data-stream>>
			
 
				+* <<reindex-with-a-data-stream>>
			
 
				 
			
 
				 ////
			
 
				 [source,console]
			
@@ -175,6 +176,109 @@ POST /logs/_rollover/
 
				 // TEST[continued]
			
 
				 ====
			
 
				 
			
 
				+[discrete]
			
 
				+[[reindex-with-a-data-stream]]
			
 
				+=== Reindex with a data stream
			
 
				+
			
 
				+You can use the <<docs-reindex,reindex API>> to copy documents to a data stream
			
 
				+from an existing index, index alias, or data stream.
			
 
				+
			
 
				+A reindex copies documents from a _source_ to a _destination_. The source and
			
 
				+destination can be any pre-existing index, index alias, or data stream. However,
			
 
				+the source and destination must be different. You cannot reindex a data stream
			
 
				+into itself.
			
 
				+
			
 
				+Because data streams are <<data-streams-append-only,append-only>>, a reindex
			
 
				+request to a data stream destination must have an `op_type` of `create`. This
			
 
				+means a reindex can only add new documents to a data stream. It cannot update
			
 
				+existing documents in the data stream destination.
			
 
				+
			
 
				+A reindex can be used to:
			
 
				+
			
 
				+* Convert an existing index alias and collection of time-based indices into a
			
 
				+  data stream.
			
 
				+
			
 
				+* Apply a new or updated <<create-a-data-stream-template,composable template>>
			
 
				+  by reindexing an existing data stream into a new one. This applies mapping
			
 
				+  and setting changes in the template to each document and backing index of the
			
 
				+  data stream destination.
			
 
				+
			
 
				+TIP: If you only want to update the mappings or settings of a data stream's
			
 
				+write index, we recommend you update the <<create-a-data-stream-template,data
			
 
				+stream's template>> and perform a <<manually-roll-over-a-data-stream,rollover>>.
			
 
				+
			
 
				+.*Example*
			
 
				+[%collapsible]
			
 
				+====
			
 
				+The following reindex request copies documents from the `archive` index alias to
			
 
				+the existing `logs` data stream. Because the destination is a data stream, the
			
 
				+request's `op_type` is `create`.
			
 
				+
			
 
				+////
			
 
				+[source,console]
			
 
				+----
			
 
				+PUT /_bulk?refresh=wait_for
			
 
				+{"create":{"_index" : "archive_1"}}
			
 
				+{ "@timestamp": "2020-12-08T11:04:05.000Z" }
			
 
				+{"create":{"_index" : "archive_2"}}
			
 
				+{ "@timestamp": "2020-12-08T11:06:07.000Z" }
			
 
				+{"create":{"_index" : "archive_2"}}
			
 
				+{ "@timestamp": "2020-12-09T11:07:08.000Z" }
			
 
				+{"create":{"_index" : "archive_2"}}
			
 
				+{ "@timestamp": "2020-12-09T11:07:08.000Z" }
			
 
				+
			
 
				+POST /_aliases
			
 
				+{
			
 
				+  "actions" : [
			
 
				+    { "add" : { "index" : "archive_1", "alias" : "archive" } },
			
 
				+    { "add" : { "index" : "archive_2", "alias" : "archive", "is_write_index" : true} }
			
 
				+  ]
			
 
				+}
			
 
				+----
			
 
				+// TEST[continued]
			
 
				+////
			
 
				+
			
 
				+[source,console]
			
 
				+----
			
 
				+POST /_reindex
			
 
				+{
			
 
				+  "source": {
			
 
				+    "index": "archive"
			
 
				+  },
			
 
				+  "dest": {
			
 
				+    "index": "logs",
			
 
				+    "op_type": "create"
			
 
				+  }
			
 
				+}
			
 
				+----
			
 
				+// TEST[continued]
			
 
				+====
			
 
				+
			
 
				+You can also reindex documents from a data stream to an index, index
			
 
				+alias, or data stream.
			
 
				+
			
 
				+.*Example*
			
 
				+[%collapsible]
			
 
				+====
			
 
				+The following reindex request copies documents from the `logs` data stream
			
 
				+to the existing `archive` index alias. Because the destination is not a data
			
 
				+stream, the `op_type` does not need to be specified.
			
 
				+
			
 
				+[source,console]
			
 
				+----
			
 
				+POST /_reindex
			
 
				+{
			
 
				+  "source": {
			
 
				+    "index": "logs"
			
 
				+  },
			
 
				+  "dest": {
			
 
				+    "index": "archive"
			
 
				+  }
			
 
				+}
			
 
				+----
			
 
				+// TEST[continued]
			
 
				+====
			
 
				+
			
 
				 ////
			
 
				 [source,console]
			
 
				 ----
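Review note: the rules this hunk adds for data stream destinations (source and destination must differ, and `op_type` must be `create`) can be sketched as a small request-body builder. This is a minimal sketch, not part of the change. `build_reindex_body` and its `dest_is_data_stream` flag are hypothetical names, and the name comparison is a simplification, since an alias can resolve to the same underlying index under a different name.

```python
def build_reindex_body(source, dest, dest_is_data_stream=False):
    """Build a _reindex request body as a plain dict (hypothetical helper)."""
    # Accept a single source name or a list of source names.
    sources = [source] if isinstance(source, str) else list(source)

    # The source and destination must be different. Comparing names is only
    # an approximation: it does not resolve aliases or backing indices.
    if dest in sources:
        raise ValueError("source and destination must be different")

    dest_clause = {"index": dest}
    if dest_is_data_stream:
        # Data streams are append-only, so the request must use op_type=create.
        dest_clause["op_type"] = "create"

    source_index = sources[0] if len(sources) == 1 else sources
    return {"source": {"index": source_index}, "dest": dest_clause}
```

Calling `build_reindex_body("archive", "logs", dest_is_data_stream=True)` yields the same body as the first `POST /_reindex` example in this hunk. Omitting the flag matches the second example, where `op_type` is left unset.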
			
--- a/docs/reference/docs/reindex.asciidoc
+++ b/docs/reference/docs/reindex.asciidoc
@@ -4,15 +4,19 @@
 
				 <titleabbrev>Reindex</titleabbrev>
			
 
				 ++++
			
 
				 
			
 
				-Copies documents from one index to another. 
			
 
				+Copies documents from a _source_ to a _destination_.
			
 
				+
			
 
				+The source and destination can be any pre-existing index, index alias, or
			
 
				+<<data-streams,data stream>>. However, the source and destination must be 
			
 
				+different. For example, you cannot reindex a data stream into itself.
			
 
				 
			
 
				 [IMPORTANT]
			
 
				 =================================================
			
 
				 Reindex requires <<mapping-source-field,`_source`>> to be enabled for
			
 
				-all documents in the source index.
			
 
				+all documents in the source.
			
 
				 
			
 
				-You must set up the destination index before calling `_reindex`.
			
 
				-Reindex does not copy the settings from the source index. 
			
 
				+The destination must exist and should be configured as needed before calling `_reindex`.
			
 
				+Reindex does not copy the settings from the source or its associated template. 
			
 
				 Mappings, shard counts, replicas, and so on must be configured ahead of time.
			
 
				 =================================================
			
 
				 
			
@@ -66,25 +70,30 @@ POST _reindex
 
				 [[docs-reindex-api-desc]]
			
 
				 ==== {api-description-title}
			
 
				 
			
 
				-Extracts the document source from the source index and indexes the documents into the destination index. 
			
 
				-You can copy all documents to the destination index, or reindex a subset of the documents. 
			
 
				+Extracts the <<mapping-source-field,document source>> from the reindex request's source and indexes the documents into the destination. 
			
 
				+You can copy all documents to the destination, or reindex a subset of the documents. 
			
 
				 
			
 
				 Just like <<docs-update-by-query,`_update_by_query`>>, `_reindex` gets a
			
 
				-snapshot of the source index but its target must be a **different** index so
			
 
				+snapshot of the source but its destination must be **different** so
			
 
				 version conflicts are unlikely. The `dest` element can be configured like the
			
 
				 index API to control optimistic concurrency control. Omitting
			
 
				 `version_type` or setting it to `internal` causes Elasticsearch
			
 
				-to blindly dump documents into the target, overwriting any that happen to have
			
 
				+to blindly dump documents into the destination, overwriting any that happen to have
			
 
				 the same ID.
			
 
				 
			
 
				 Setting `version_type` to `external` causes Elasticsearch to preserve the
			
 
				 `version` from the source, create any documents that are missing, and update
			
 
				-any documents that have an older version in the destination index than they do
			
 
				-in the source index.
			
 
				+any documents that have an older version in the destination than they do
			
 
				+in the source.
			
 
				 
			
 
				 Setting `op_type` to `create` causes `_reindex` to only create missing
			
 
				-documents in the target index. All existing documents will cause a version
			
 
				-conflict. 
			
 
				+documents in the destination. All existing documents will cause a version
			
 
				+conflict.
			
 
				+
			
 
				+IMPORTANT: Because data streams are <<data-streams-append-only,append-only>>,
			
 
				+any reindex request to a destination data stream must have an `op_type`
			
 
				+of `create`. A reindex can only add new documents to a destination data stream.
			
 
				+It cannot update existing documents in a destination data stream.
			
 
				 
			
 
				 By default, version conflicts abort the `_reindex` process. 
			
 
				 To continue reindexing if there are conflicts, set the `"conflicts"` request body parameter to `proceed`. 
			
@@ -101,13 +110,13 @@ performs some preflight checks, launches the request, and returns a
 
				 When you are done with a task, you should delete the task document so 
			
 
				 {es} can reclaim the space.
			
 
				 
			
 
				-[[docs-reindex-many-indices]]
			
 
				-===== Reindexing many indices
			
 
				-If you have many indices to reindex it is generally better to reindex them
			
 
				-one at a time rather than using a glob pattern to pick up many indices. That
			
 
				+[[docs-reindex-from-multiple-sources]]
			
 
				+===== Reindex from multiple sources
			
 
				+If you have many sources to reindex, it is generally better to reindex them
			
 
				+one at a time rather than using a glob pattern to pick up multiple sources. That
			
 
				 way you can resume the process if there are any errors by removing the
			
 
				-partially completed index and starting over at that index. It also makes
			
 
				-parallelizing the process fairly simple: split the list of indices to reindex
			
 
				+partially completed source and starting over. It also makes
			
 
				+parallelizing the process fairly simple: split the list of sources to reindex
			
 
				 and run each list in parallel.
			
 
				 
			
 
				 One-off bash scripts seem to work nicely for this:
			
@@ -283,10 +292,11 @@ which results in a sensible `total` like this one:
 
				 }
			
 
				 ----------------------------------------------------------------
			
 
				 
			
 
				-Setting `slices` to `auto` will let Elasticsearch choose the number of slices
			
 
				-to use. This setting will use one slice per shard, up to a certain limit. If
			
 
				-there are multiple source indices, it will choose the number of slices based
			
 
				-on the index with the smallest number of shards.
			
 
				+Setting `slices` to `auto` will let Elasticsearch choose the number of slices to
			
 
				+use. This setting will use one slice per shard, up to a certain limit. If there
			
 
				+are multiple sources, it will choose the number of
			
 
				+slices based on the index or <<data-streams,backing index>> with the smallest
			
 
				+number of shards.
			
 
				 
			
 
				 Adding `slices` to `_reindex` just automates the manual process used in the
			
 
				 section above, creating sub-requests which means it has some quirks:
			
@@ -308,7 +318,7 @@ be larger than others. Expect larger slices to have a more even distribution.
 
				 the point above about distribution being uneven and you should conclude that
			
 
				 using `max_docs` with `slices` might not result in exactly `max_docs` documents
			
 
				 being reindexed.
			
 
				-* Each sub-request gets a slightly different snapshot of the source index,
			
 
				+* Each sub-request gets a slightly different snapshot of the source,
			
 
				 though these are all taken at approximately the same time.
			
 
				 
			
 
				 [[docs-reindex-picking-slices]]
			
@@ -352,7 +362,7 @@ Sets the routing on the bulk request sent for each match to all text after
 
				 the `=`.
			
 
				 
			
 
				 For example, you can use the following request to copy all documents from
			
 
				-the `source` index with the company name `cat` into the `dest` index with
			
 
				+the `source` with the company name `cat` into the `dest` with
			
 
				 routing set to `cat`.
			
 
				 
			
 
				 [source,console]
			
@@ -442,8 +452,8 @@ Defaults to `abort`.
 
				 
			
 
				 `source`::
			
 
				 `index`:::
			
 
				-(Required, string) The name of the index you are copying _from_. 
			
 
				-Also accepts a comma-separated list of indices to reindex from multiple sources.  
			
 
				+(Required, string) The name of the data stream, index, or index alias you are copying _from_. 
			
 
				+Also accepts a comma-separated list to reindex from multiple sources.  
			
 
				 
			
 
				 `max_docs`:::
			
 
				 (Optional, integer) The maximum number of documents to reindex.
			
@@ -491,7 +501,7 @@ Defaults to `true`.
 
				 
			
 
				 `dest`::
			
 
				 `index`:::
			
 
				-(Required, string) The name of the index you are copying _to_.
			
 
				+(Required, string) The name of the data stream, index, or index alias you are copying _to_.
			
 
				 
			
 
				 `version_type`:::
			
 
				 (Optional, enum) The versioning to use for the indexing operation.  
			
@@ -501,6 +511,9 @@ See <<index-version-types>> for more information.
 
				 `op_type`::: 
			
 
				 (Optional, enum) Set to create to only index documents that do not already exist (put if absent). 
			
 
				 Valid values: `index`, `create`. Defaults to `index`.
			
 
				++
			
 
				+IMPORTANT: To reindex to a data stream destination, this argument must be
			
 
				+`create`.
			
 
				 
			
 
				 `script`::
			
 
				 `source`::: 
			
@@ -629,8 +642,8 @@ POST _reindex
 
				 --------------------------------------------------
			
 
				 // TEST[setup:twitter]
			
 
				 
			
 
				-[[docs-reindex-multiple-indices]]
			
 
				-===== Reindex from multiple indices
			
 
				+[[docs-reindex-multiple-sources]]
			
 
				+===== Reindex from multiple sources
			
 
				 
			
 
				 The `index` attribute in `source` can be a list, allowing you to copy from lots 
			
 
				 of sources in one request. This will copy documents from the
			
@@ -794,9 +807,9 @@ The previous method can also be used in conjunction with <<docs-reindex-change-n
 
				 to load only the existing data into the new index and rename any fields if needed.
			
 
				 
			
 
				 [[docs-reindex-api-subset]]
			
 
				-===== Extract a random subset of an index
			
 
				+===== Extract a random subset of the source
			
 
				 
			
 
				-`_reindex` can be used to extract a random subset of an index for testing:
			
 
				+`_reindex` can be used to extract a random subset of the source for testing:
			
 
				 
			
 
				 [source,console]
			
 
				 ----------------------------------------------------------------
			
@@ -849,18 +862,18 @@ POST _reindex
 
				 // TEST[setup:twitter]
			
 
				 
			
 
				 Just as in `_update_by_query`, you can set `ctx.op` to change the
			
 
				-operation that is executed on the destination index:
			
 
				+operation that is executed on the destination:
			
 
				 
			
 
				 `noop`::
			
 
				 
			
 
				 Set `ctx.op = "noop"` if your script decides that the document doesn't have
			
 
				-to be indexed in the destination index. This no operation will be reported
			
 
				+to be indexed in the destination. This no operation will be reported
			
 
				 in the `noop` counter in the <<docs-reindex-api-response-body, response body>>.
			
 
				 
			
 
				 `delete`::
			
 
				 
			
 
				 Set `ctx.op = "delete"` if your script decides that the document must be
			
 
				- deleted from the destination index. The deletion will be reported in the
			
 
				+ deleted from the destination. The deletion will be reported in the
			
 
				  `deleted` counter in the <<docs-reindex-api-response-body, response body>>.
			
 
				 
			
 
				 Setting `ctx.op` to anything else will return an error, as will setting any
			
@@ -876,7 +889,7 @@ change:
 
				 
			
 
				 Setting `_version` to `null` or clearing it from the `ctx` map is just like not
			
 
				 sending the version in an indexing request; it will cause the document to be
			
 
				-overwritten in the target index regardless of the version on the target or the
			
 
				+overwritten in the destination regardless of the version on the destination or the
			
 
				 version type you use in the `_reindex` request.
			
 
				 
			
 
				 [[reindex-from-remote]]
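Review note: the reworded `slices: auto` paragraph in this file says the slice count is one per shard, capped at a limit, and that with multiple sources it is based on the index or backing index with the fewest shards. A minimal sketch of that selection rule follows. `auto_slices` is a hypothetical helper, and `limit` is passed in as a parameter because the text only says "a certain limit" without giving its value.

```python
def auto_slices(shard_counts, limit):
    """Pick a slice count from per-index shard counts (hypothetical helper).

    shard_counts: number of shards for each source index or backing index.
    limit: upper bound on slices; the actual value is not stated in the docs.
    """
    if not shard_counts:
        raise ValueError("at least one source index is required")
    # One slice per shard of the smallest source, never exceeding the limit.
    return min(min(shard_counts), limit)
```

For example, with sources of 5, 3, and 8 shards and an assumed limit of 20, the sketch picks 3 slices. A single 50-shard source under the same assumed limit is capped at 20.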
			
--- a/docs/reference/glossary.asciidoc
+++ b/docs/reference/glossary.asciidoc
@@ -352,11 +352,20 @@ during the following processes:
 
				 --
			
 
				 
			
 
				 [[glossary-reindex]] reindex ::
			
 
				-
			
 
				++
			
 
				+--
			
 
				 // tag::reindex-def[]
			
 
				-To cycle through some or all documents in one or more indices, re-writing them into the same 
			
 
				-or new index in a local or remote cluster. This is most commonly done to update mappings, or to upgrade {es} between two incompatible index versions.
			
 
				+Copies documents from a _source_ to a _destination_. The source and
			
 
				+destination can be any pre-existing index, index alias, or
			
 
				+{ref}/data-streams.html[data stream].
			
 
				+
			
 
				+You can reindex all documents from a source or select a subset of documents to
			
 
				+copy. You can also reindex to a destination in a remote cluster.
			
 
				+
			
 
				+A reindex is often performed to update mappings, change static index settings,
			
 
				+or upgrade {es} between incompatible versions.
			
 
				 // end::reindex-def[]
			
 
				+--
			
 
				 
			
 
				 [[glossary-remote-cluster]] remote cluster ::