
[DOCS] Reformat data stream tutorial docs (#57883)

Creates a new page for a 'Set up a data stream' tutorial, based on
existing content in 'Data streams'.

Also adds tutorials for:

* Configuring an ILM policy for a data stream
* Indexing documents to a data stream
* Searching a data stream
* Manually rolling over a data stream

James Rodewig · 5 years ago · commit b1e28d9f84

+ 14 - 72
docs/reference/data-streams.asciidoc → docs/reference/data-streams/data-streams.asciidoc

@@ -1,7 +1,8 @@
-[chapter]
 [[data-streams]]
-= Data Streams
+= Data streams
 
+[partintro]
+--
 You can use data streams to index time-based data that's continuously generated.
 A data stream groups indices from the same time-based data source.
 A data stream tracks its indices, known as _backing indices_, using an ordered
@@ -44,76 +45,6 @@ generation of the data stream. For example, a data stream named
 conform to this naming convention if operations such as
 <<indices-shrink-index,shrink>> have been performed on them.
 
-[discrete]
-[[create-data-stream]]
-== Create a data stream
-
-Create a composable template with a `data_stream` definition:
-
-[source,console]
------------------------------------
-PUT /_index_template/logs_template
-{
-  "index_patterns": ["logs-*"],
-  "data_stream": {
-    "timestamp_field": "@timestamp"
-  }
-}
------------------------------------
-
-Start indexing data to a target matching the composable template's wildcard
-pattern:
-
-[source,console]
-----
-POST /logs-foobar/_doc
-{
-  "@timestamp": "2050-11-15T14:12:12",
-  ...
-}
-----
-// TEST[continued]
-// TEST[s/,//]
-// TEST[s/\.\.\.//]
-
-Response:
-
-[source,console-result]
---------------------------------------------------
-{
-    "_shards" : {
-        "total" : 2,
-        "failed" : 0,
-        "successful" : 1
-    },
-    "_index" : ".ds-logs-foobar-000001",
-    "_id" : "W0tpsmIBdwcYyG50zbta",
-    "_version" : 1,
-    "_seq_no" : 0,
-    "_primary_term" : 1,
-    "result": "created"
-}
---------------------------------------------------
-// TESTRESPONSE[s/W0tpsmIBdwcYyG50zbta/$body._id/]
-
-Or create a data stream using the create data stream API:
-
-[source,console]
---------------------------------------------------
-PUT /_data_stream/logs-barbaz
---------------------------------------------------
-// TEST[continued]
-
-////
-[source,console]
------------------------------------
-DELETE /_data_stream/logs-foobar
-DELETE /_data_stream/logs-barbaz
-DELETE /_index_template/logs_template
------------------------------------
-// TEST[continued]
-////
-
 [discrete]
 [[data-streams-apis]]
 == Data stream APIs
@@ -123,3 +54,14 @@ The following APIs are available for managing data streams:
 * To get information about data streams, use the <<indices-get-data-stream, get data stream API>>.
 * To delete data streams, use the <<indices-delete-data-stream, delete data stream API>>.
 * To manually create a data stream, use the <<indices-create-data-stream, create data stream API>>.
+
+[discrete]
+[[data-streams-toc]]
+== In this section
+
+* <<set-up-a-data-stream>>
+* <<use-a-data-stream>>
+--
+
+include::set-up-a-data-stream.asciidoc[]
+include::use-a-data-stream.asciidoc[]

+ 242 - 0
docs/reference/data-streams/set-up-a-data-stream.asciidoc

@@ -0,0 +1,242 @@
+[[set-up-a-data-stream]]
+== Set up a data stream
+
+To set up a data stream, follow these steps:
+
+. Check the <<data-stream-prereqs, prerequisites>>.
+. <<configure-a-data-stream-ilm-policy>>.
+. <<create-a-data-stream-template>>.
+. <<create-a-data-stream>>.
+
+After you set up a data stream, you can <<use-a-data-stream, use the data
+stream>> for indexing, searches, and other supported operations.
+
+[discrete]
+[[data-stream-prereqs]]
+=== Prerequisites
+
+* {es} data streams are intended for time-series data only. Each document
+indexed to a data stream must contain a shared timestamp field.
++
+TIP: Data streams work well with most common log formats. While no schema is
+required to use data streams, we recommend the {ecs-ref}[Elastic Common Schema
+(ECS)].
+
+* Data streams are designed to be append-only. While you can index new documents
+directly to a data stream, you cannot use a data stream to directly update or
+delete individual documents. To update or delete specific documents in a data
+stream, submit a <<docs-delete,delete>> or <<docs-update,update>> API request to
+the backing index containing the document.
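++
+For example, the following sketch deletes a document by ID from one of the
+stream's backing indices. The backing index name and document ID here are
+placeholders; you'd typically look up the real values with a search first.
++
+[source,console]
+----
+DELETE /.ds-logs-000002/_doc/qecQmXIBT4jB8tq1nG0j
+----
+// NOTCONSOLE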
+
+
+[discrete]
+[[configure-a-data-stream-ilm-policy]]
+=== Optional: Configure an {ilm-init} lifecycle policy for a data stream
+
+You can use <<index-lifecycle-management,{ilm} ({ilm-init})>> to automatically
+manage a data stream's backing indices. For example, you could use {ilm-init}
+to:
+
+* Spin up a new write index for the data stream when the current one reaches a
+  certain size or age.
+* Move older backing indices to slower, less expensive hardware.
+* Delete stale backing indices to enforce data retention standards.
+
+To use {ilm-init} with a data stream, you must
+<<set-up-lifecycle-policy,configure a lifecycle policy>>. This lifecycle policy
+should contain the automated actions to take on backing indices and the
+triggers for such actions.
+
+TIP: While optional, we recommend using {ilm-init} to scale data streams in
+production.
+
+.*Example*
+[%collapsible]
+====
+The following <<ilm-put-lifecycle,create lifecycle policy API>> request
+configures the `logs_policy` lifecycle policy.
+
+The `logs_policy` policy uses the <<ilm-rollover,`rollover` action>> to create a
+new write index for the data stream when the current one reaches 25GB in size.
+The policy also deletes backing indices 30 days after their rollover.
+
+[source,console]
+----
+PUT /_ilm/policy/logs_policy
+{
+  "policy": {
+    "phases": {
+      "hot": {
+        "actions": {
+          "rollover": {
+            "max_size": "25GB"
+          }
+        }
+      },
+      "delete": {
+        "min_age": "30d",
+        "actions": {
+          "delete": {}
+        }
+      }
+    }
+  }
+}
+----
+====
+
+
+[discrete]
+[[create-a-data-stream-template]]
+=== Create a composable template for a data stream
+
+Each data stream requires a <<indices-templates,composable template>>. The data
+stream uses this template to create its backing indices.
+
+Composable templates for data streams must contain:
+
+* A name or wildcard (`*`) pattern for the data stream in the `index_patterns`
+  property.
+
+* A `data_stream` definition containing the `timestamp_field` property.
+  This timestamp field must be included in every document indexed to the data
+  stream.
+
+* A <<date,`date`>> or <<date_nanos,`date_nanos`>> field mapping for the
+  timestamp field specified in the `timestamp_field` property.
+
+* If you intend to use {ilm-init}, you must specify the
+  <<configure-a-data-stream-ilm-policy,lifecycle policy>> in the 
+  `index.lifecycle.name` setting.
+
+You can also specify other mappings and settings you'd like to apply to the
+stream's backing indices.
+
+.*Example*
+[%collapsible]
+====
+The following <<indices-templates,put composable template API>> request
+configures the `logs_data_stream` template.
+
+[source,console]
+----
+PUT /_index_template/logs_data_stream
+{
+  "index_patterns": [ "logs*" ],
+  "data_stream": {
+    "timestamp_field": "@timestamp"
+  },
+  "template": {
+    "mappings": {
+      "properties": {
+        "@timestamp": {
+          "type": "date"
+        }
+      }
+    },
+    "settings": {
+      "index.lifecycle.name": "logs_policy"
+    }
+  }
+}
+----
+// TEST[continued]
+====
+
+[discrete]
+[[create-a-data-stream]]
+=== Create a data stream
+
+With a composable template, you can create a data stream using one of two
+methods:
+
+* Submit an <<add-documents-to-a-data-stream,indexing request>> to a target
+matching the name or wildcard pattern defined in the template's `index_patterns`
+property.
++
+--
+If the indexing request's target doesn't exist, {es} creates the data stream and
+uses the target name as the name for the stream.
+
+NOTE: Data streams support only specific types of indexing requests. See
+<<add-documents-to-a-data-stream>>.
+
+.*Example: Index documents to create a data stream*
+[%collapsible]
+====
+The following <<docs-index_,index API>> request targets `logs`, which matches
+the wildcard pattern for the `logs_data_stream` template. Because no existing
+index or data stream uses this name, this request creates the `logs` data stream
+and indexes the document to it.
+
+[source,console]
+----
+POST /logs/_doc/
+{
+  "@timestamp": "2020-12-06T11:04:05.000Z",
+  "user": {
+    "id": "vlb44hny"
+  },
+  "message": "Login attempt failed"
+}
+----
+// TEST[continued]
+
+The API returns the following response. Note the `_index` property contains
+`.ds-logs-000001`, indicating the document was indexed to the write index of the
+new `logs` data stream.
+
+[source,console-result]
+----
+{
+  "_index": ".ds-logs-000001",
+  "_id": "qecQmXIBT4jB8tq1nG0j",
+  "_version": 1,
+  "result": "created",
+  "_shards": {
+    "total": 2,
+    "successful": 1,
+    "failed": 0
+  },
+  "_seq_no": 0,
+  "_primary_term": 1
+}
+----
+// TESTRESPONSE[s/"_id": "qecQmXIBT4jB8tq1nG0j"/"_id": $body._id/]
+====
+--
+
+* Use the <<indices-create-data-stream,create data stream API>> to manually
+create a data stream. The name of the data stream must match the
+name or wildcard pattern defined in the template's `index_patterns` property.
++
+--
+.*Example: Manually create a data stream*
+[%collapsible]
+====
+The following <<indices-create-data-stream,create data stream API>> request
+targets `logs_alt`, which matches the wildcard pattern for the
+`logs_data_stream` template. Because no existing index or data stream uses this
+name, this request creates the `logs_alt` data stream.
+
+[source,console]
+----
+PUT /_data_stream/logs_alt
+----
+// TEST[continued]
+====
+--
+
+////
+[source,console]
+----
+DELETE /_data_stream/logs
+
+DELETE /_data_stream/logs_alt
+
+DELETE /_index_template/logs_data_stream
+
+DELETE /_ilm/policy/logs_policy
+----
+// TEST[continued]
+////

+ 185 - 0
docs/reference/data-streams/use-a-data-stream.asciidoc

@@ -0,0 +1,185 @@
+[[use-a-data-stream]]
+== Use a data stream
+
+After you <<set-up-a-data-stream,set up a data stream>>, you can do
+the following:
+
+* <<add-documents-to-a-data-stream>>
+* <<search-a-data-stream>>
+* <<manually-roll-over-a-data-stream>>
+
+////
+[source,console]
+----
+PUT /_index_template/logs_data_stream
+{
+  "index_patterns": [ "logs*" ],
+  "data_stream": {
+    "timestamp_field": "@timestamp"
+  },
+  "template": {
+    "mappings": {
+      "properties": {
+        "@timestamp": {
+          "type": "date"
+        }
+      }
+    }
+  }
+}
+
+PUT /_data_stream/logs
+----
+////
+
+[discrete]
+[[add-documents-to-a-data-stream]]
+=== Add documents to a data stream
+
+You can add documents to a data stream using the following requests:
+
+* An <<docs-index_,index API>> request with an
+<<docs-index-api-op_type,`op_type`>> set to `create`. Specify the data
+stream's name in place of an index name.
++
+--
+NOTE: The `op_type` parameter defaults to `create` when the request does not
+specify a document ID.
+
+.*Example: Index API request*
+[%collapsible]
+====
+The following <<docs-index_,index API>> request adds a new document to the
+`logs` data stream.
+
+[source,console]
+----
+POST /logs/_doc/
+{
+  "@timestamp": "2020-12-07T11:06:07.000Z",
+  "user": {
+    "id": "8a4f500d"
+  },
+  "message": "Login successful"
+}
+----
+// TEST[continued]
+====
+--
+
+* A <<docs-bulk,bulk API>> request using the `create` action. Specify the data
+stream's name in place of an index name.
++
+--
+NOTE: Data streams do not support other bulk actions, such as `index`.
+
+.*Example: Bulk API request*
+[%collapsible]
+====
+The following <<docs-bulk,bulk API>> request adds several new documents to
+the `logs` data stream. Note that only the `create` action is used.
+
+[source,console]
+----
+PUT /logs/_bulk?refresh
+{"create":{"_index" : "logs"}}
+{ "@timestamp": "2020-12-08T11:04:05.000Z", "user": { "id": "vlb44hny" }, "message": "Login attempt failed" }
+{"create":{"_index" : "logs"}}
+{ "@timestamp": "2020-12-08T11:06:07.000Z", "user": { "id": "8a4f500d" }, "message": "Login successful" }
+{"create":{"_index" : "logs"}}
+{ "@timestamp": "2020-12-09T11:07:08.000Z", "user": { "id": "l7gk7f82" }, "message": "Logout successful" }
+----
+// TEST[continued]
+====
+--
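+
+To specify a document ID when adding a document, set `op_type` to `create`
+explicitly in an index API request. As a sketch, using a placeholder ID:
+
+[source,console]
+----
+PUT /logs/_doc/my-doc-id-1?op_type=create
+{
+  "@timestamp": "2020-12-09T11:08:09.000Z",
+  "user": {
+    "id": "l7gk7f82"
+  },
+  "message": "Logout successful"
+}
+----
+// NOTCONSOLE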
+
+[discrete]
+[[search-a-data-stream]]
+=== Search a data stream
+
+The following search APIs support data streams:
+
+* <<search-search, Search>>
+* <<async-search, Async search>>
+* <<search-multi-search, Multi search>>
+* <<search-field-caps, Field capabilities>>
+////
+* <<eql-search-api, EQL search>>
+////
+
+.*Example*
+[%collapsible]
+====
+The following <<search-search,search API>> request searches the `logs` data
+stream for documents with a timestamp from the previous day, favoring documents
+with a `message` value of `login successful`.
+
+[source,console]
+----
+GET /logs/_search
+{
+  "query": {
+    "bool": {
+      "must": {
+        "range": {
+          "@timestamp": {
+            "gte": "now-1d/d",
+            "lt": "now/d"
+          }
+        }
+      },
+      "should": {
+        "match": {
+          "message": "login successful"
+        }
+      }
+    }
+  }
+}
+----
+// TEST[continued]
+====
+
+[discrete]
+[[manually-roll-over-a-data-stream]]
+=== Manually roll over a data stream
+
+A rollover creates a new backing index for a data stream. This new backing index
+becomes the stream's new write index and increments the stream's generation.
+
+In most cases, we recommend using <<index-lifecycle-management,{ilm-init}>> to
+automate rollovers for data streams. This lets you automatically roll over the
+current write index when it meets specified criteria, such as a maximum age or
+size.
+
+However, you can also use the <<indices-rollover-index,rollover API>> to
+manually perform a rollover. This can be useful if you want to apply mapping or
+setting changes to the stream's write index after updating a data stream's
+template.
+
+.*Example*
+[%collapsible]
+====
+The following <<indices-rollover-index,rollover API>> request submits a manual
+rollover request for the `logs` data stream.
+
+[source,console]
+----
+POST /logs/_rollover/
+{
+  "conditions": {
+    "max_docs": 1
+  }
+}
+----
+// TEST[continued]
+====
+
+////
+[source,console]
+----
+DELETE /_data_stream/logs
+
+DELETE /_index_template/logs_data_stream
+----
+// TEST[continued]
+////

+ 1 - 0
docs/reference/docs/index_.asciidoc

@@ -41,6 +41,7 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=if_seq_no]
 
 include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=if_primary_term]
 
+[[docs-index-api-op_type]]
 `op_type`::
 (Optional, enum) Set to `create` to only index the document
 if it does not already exist (_put if absent_). If a document with the specified

+ 2 - 2
docs/reference/index.asciidoc

@@ -18,6 +18,8 @@ include::setup.asciidoc[]
 
 include::upgrade.asciidoc[]
 
+include::data-streams/data-streams.asciidoc[]
+
 include::search/index.asciidoc[]
 
 include::query-dsl.asciidoc[]
@@ -58,8 +60,6 @@ include::high-availability.asciidoc[]
 
 include::snapshot-restore/index.asciidoc[]
 
-include::data-streams.asciidoc[]
-
 include::{xes-repo-dir}/security/index.asciidoc[]
 
 include::{xes-repo-dir}/watcher/index.asciidoc[]