[DOCS] Reformat data streams intro and overview (#57954)

Changes:

* Updates 'Data streams' intro page to focus on problem solution and
  benefits.

* Adds 'Data streams overview' page to cover conceptual information,
  based on existing content in the 'Data streams' intro.

* Adds diagrams for data streams and search/indexing request examples.

* Moves API jump list and API docs to a new 'Data streams APIs' section.
  Links to these APIs will be available through tutorials.

* Adds xrefs to existing docs for concepts like generation, write index,
  and append-only.
James Rodewig 5 years ago
parent
commit
30cc10d3ac

+ 14 - 0
docs/reference/data-streams/data-stream-apis.asciidoc

@@ -0,0 +1,14 @@
+[[data-stream-apis]]
+== Data stream APIs
+
+The following APIs are available for managing data streams:
+
+* To get information about data streams, use the <<indices-get-data-stream, get data stream API>>.
+* To delete data streams, use the <<indices-delete-data-stream, delete data stream API>>.
+* To manually create a data stream, use the <<indices-create-data-stream, create data stream API>>.
+
+include::{es-repo-dir}/indices/create-data-stream.asciidoc[]
+
+include::{es-repo-dir}/indices/get-data-stream.asciidoc[]
+
+include::{es-repo-dir}/indices/delete-data-stream.asciidoc[]

+ 142 - 0
docs/reference/data-streams/data-streams-overview.asciidoc

@@ -0,0 +1,142 @@
+[[data-streams-overview]]
+== Data streams overview
+++++
+<titleabbrev>Overview</titleabbrev>
+++++
+
+A data stream consists of one or more _backing indices_. Backing indices are
+<<index-hidden,hidden>>, automatically-generated indices used to store a
+stream's documents.
+
+image::images/data-streams/data-streams-diagram.svg[align="center"]
+
+The creation of a data stream requires an associated
+<<indices-templates,composable template>>. This template acts as a blueprint for
+the stream's backing indices. It contains:
+
+* A name or wildcard (`*`) pattern for the data stream.
+
+* The data stream's _timestamp field_. This field must be mapped as a
+  <<date,`date`>> or <<date_nanos,`date_nanos`>> field datatype and must be
+  included in every document indexed to the data stream.
+
+* The mappings and settings applied to each backing index when it's created.
+
+The same composable template can be used to create multiple data streams.
+See <<set-up-a-data-stream>>.
+
+[discrete]
+[[data-streams-generation]]
+=== Generation
+
+Each data stream tracks its _generation_: a six-digit, zero-padded integer
+that acts as a cumulative count of the data stream's backing indices. This count
+includes any deleted indices for the stream. The generation is incremented
+whenever a new backing index is added to the stream.
+
+When a backing index is created, the index is named using the following
+convention:
+
+[source,text]
+----
+.ds-<data-stream>-<generation>
+----
+
+.*Example*
+[%collapsible]
+====
+The `web_server_logs` data stream has a generation of `34`. The most recently
+created backing index for this data stream is named
+`.ds-web_server_logs-000034`.
+====
+
+Because the generation increments with each new backing index, backing indices
+with a higher generation contain more recent data. Backing indices with a lower
+generation contain older data.
+
+A backing index's name can change after its creation due to a
+<<indices-shrink-index,shrink>>, <<snapshots-restore-snapshot,restore>>, or
+other operations.
+
+[discrete]
+[[data-stream-write-index]]
+=== Write index
+
+When a read request is sent to a data stream, it routes the request to all its
+backing indices. For example, a search request sent to a data stream would query
+all its backing indices.
+
+image::images/data-streams/data-streams-search-request.svg[align="center"]
+
+However, the most recently created backing index is the data stream’s only
+_write index_. The data stream routes all indexing requests for new documents to
+this index.
+
+image::images/data-streams/data-streams-index-request.svg[align="center"]
+
+You cannot add new documents to a stream's other backing indices, even by
+sending requests directly to the index. This means you cannot submit the
+following requests directly to any backing index except the write index:
+
+* An <<docs-index_,Index API>> request with an
+  <<docs-index-api-op_type,`op_type`>> of `create`. The `op_type` parameter
+  defaults to `create` when adding new documents.
+* A <<docs-bulk,Bulk API>> request using a `create` action
+
+Because it's the only index capable of ingesting new documents, you cannot
+perform operations on a write index that might hinder indexing. These
+prohibited operations include:
+
+* <<indices-close,Closing the write index>>
+* <<indices-delete-index,Deleting the write index>>
+* <<freeze-index-api,Freezing the write index>>
+* <<indices-shrink-index,Shrinking the write index>>
+
+[discrete]
+[[data-streams-rollover]]
+=== Rollover
+
+When a data stream is created, one backing index is automatically created.
+Because this single index is also the most recently created backing index, it
+acts as the stream's write index.
+
+A <<indices-rollover-index,rollover>> creates a new backing index for a data
+stream. This new backing index becomes the stream's write index, replacing
+the current one, and increments the stream's generation.
+
+In most cases, we recommend using <<index-lifecycle-management,{ilm}
+({ilm-init})>> to automate rollovers for data streams. This lets you
+automatically roll over the current write index when it meets specified
+criteria, such as a maximum age or size.
+
+However, you can also use the <<indices-rollover-index,rollover API>> to
+manually perform a rollover. See <<manually-roll-over-a-data-stream>>.
+
+[discrete]
+[[data-streams-append-only]]
+=== Append-only
+
+For most time-series use cases, existing data is rarely, if ever, updated.
+Because of this, data streams are designed to be append-only. This means you can
+send indexing requests for new documents directly to a data stream. However, you
+cannot send update or deletion requests for existing documents to a data stream.
+
+To update or delete specific documents in a data stream, submit one of the
+following requests to the backing index containing the document:
+
+* An <<docs-index_,Index API>> request with an
+  <<docs-index-api-op_type,`op_type`>> of `index`.
+  These requests must include valid <<optimistic-concurrency-control,`if_seq_no`
+  and `if_primary_term`>> arguments.
+
+* A <<docs-bulk,Bulk API>> request using the `delete`, `index`, or `update`
+  action. If the action type is `index`, the action must include valid
+  <<bulk-optimistic-concurrency-control,`if_seq_no` and `if_primary_term`>>
+  arguments.
+
+* A <<docs-delete,Delete API>> request
+
+TIP: If you need to frequently update or delete existing documents across
+multiple indices, we recommend using an <<indices-add-alias,index alias>> and
+<<indices-templates,index template>> instead of a data stream. You can still
+use <<index-lifecycle-management,{ilm-init}>> to manage the indices.
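
The generation, naming, write index, and rollover rules described in data-streams-overview.asciidoc above can be sketched as a small in-memory model. This is only an illustration in Python, not Elasticsearch code: `backing_index_name`, `DataStreamModel`, and `delete_backing_index` are hypothetical names introduced here, and the model ignores the renames that a shrink or restore may cause.

```python
from dataclasses import dataclass, field


def backing_index_name(data_stream: str, generation: int) -> str:
    """Format a name following the .ds-<data-stream>-<generation> convention."""
    # The generation is rendered as a six-digit, zero-padded integer.
    return f".ds-{data_stream}-{generation:06d}"


@dataclass
class DataStreamModel:
    """Toy model of a data stream (hypothetical; not an Elasticsearch API)."""

    name: str
    generation: int = 0
    backing_indices: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # Creating a data stream automatically creates its first backing index.
        self.rollover()

    @property
    def write_index(self) -> str:
        # The most recently created backing index is the only write index.
        return self.backing_indices[-1]

    def rollover(self) -> str:
        # A rollover increments the generation and adds a new backing index,
        # which replaces the current write index.
        self.generation += 1
        new_index = backing_index_name(self.name, self.generation)
        self.backing_indices.append(new_index)
        return new_index

    def delete_backing_index(self, index: str) -> None:
        # Deleting the write index is prohibited. Deleting any other backing
        # index leaves the generation unchanged, because the generation is a
        # cumulative count that includes deleted indices.
        if index == self.write_index:
            raise ValueError("cannot delete the write index")
        self.backing_indices.remove(index)


stream = DataStreamModel("web_server_logs")
print(stream.write_index)  # .ds-web_server_logs-000001
stream.rollover()
stream.delete_backing_index(".ds-web_server_logs-000001")
print(stream.generation, stream.write_index)  # 2 .ds-web_server_logs-000002
```

Keeping the generation separate from the length of `backing_indices` mirrors the statement that the count includes deleted indices, so index names are never reused after a deletion.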

+ 37 - 44
docs/reference/data-streams/data-streams.asciidoc

@@ -1,67 +1,60 @@
 [[data-streams]]
 = Data streams
+++++
+<titleabbrev>Data streams</titleabbrev>
+++++
 
-[partintro]
---
-You can use data streams to index time-based data that's continuously generated.
-A data stream groups indices from the same time-based data source.
-A data stream tracks its indices, known as _backing indices_, using an ordered
-list.
+A _data stream_ is a convenient, scalable way to ingest, search, and manage
+continuously generated time-series data.
 
-A data stream's backing indices are <<index-hidden,hidden>>.
-While all backing indices handle read requests, the most recently created
-backing index is the data stream's only write index.  A data stream only
-accepts <<docs-index_,index requests>> with `op_type` set to `create`. To update
-or delete specific documents in a data stream, submit a <<docs-delete,delete>>
-or <<docs-update,update>> API request to the backing index containing the
-document.
+Time-series data, such as logs, tends to grow over time. While storing an entire
+time series in a single {es} index is simpler, it is often more efficient and
+cost-effective to store large volumes of data across multiple, time-based
+indices. Multiple indices let you move indices containing older, less frequently
+queried data to less expensive hardware and delete indices when they're no
+longer needed, reducing overhead and storage costs.
 
-To create a data stream, set up a <<indices-templates,composable index
-template>> containing:
+A data stream is designed to give you the best of both worlds:
 
-* A name or wildcard pattern for the data stream in the `index_patterns` property.
-* A `data_stream` definition that contains the `timestamp_field` property.
-  The `timestamp_field` must be the primary timestamp field
-   for the data source. This field must be included in every
-   document indexed to the data stream.
+* The simplicity of a single, named resource you can use for requests
+  related
+* The storage, scalability, and cost-saving benefits of multiple indices
 
-When you index one or more documents to a not-yet-existent target matching
-the template's name or pattern, {es} automatically creates the corresponding
-data stream. You can also manually create a data stream using the
-<<indices-create-data-stream,create data stream API>>. However, a composable
-template for the stream is still required.
+You can submit indexing and search requests directly to a data stream. The
+stream automatically routes the requests to a collection of hidden,
+auto-generated indices that store the stream's data.
 
-You can use the <<indices-rollover-index,rollover API>> to roll a data stream
-over to a new index when the current write index meets specified criteria, such
-as a maximum age or size. A rollover creates a new backing index and updates the
-data stream's list of backing indices. This new index then becomes the stream's
-new write index. See <<rollover-data-stream-ex>>.
+You can use a <<indices-templates,composable template>> and
+<<index-lifecycle-management,{ilm} ({ilm-init})>> to automate the management of
+these hidden indices. You can use {ilm-init} to spin up new indices, allocate
+indices to different hardware, delete old indices, and take other automatic
+actions based on age or size criteria you set. This lets you seamlessly scale
+your data storage based on your budget, performance, resiliency, and retention
+needs.
 
-Backing indices are generated with the naming convention
-`.ds-<data-stream-name>-zzzzzz`, where `zzzzzz` is the six-digit, zero-padded
-generation of the data stream. For example, a data stream named
-`web-server-logs` with a generation of 34 would have a write index named
-`.ds-web-server-logs-000034`. Data streams may have backing indices that do not
-conform to this naming convention if operations such as
-<<indices-shrink-index,shrink>> have been performed on them.
 
 [discrete]
-[[data-streams-apis]]
-== Data stream APIs
+[[when-to-use-data-streams]]
+== When to use data streams
 
-The following APIs are available for managing data streams:
+We recommend using data streams if you:
+
+* Use {es} to ingest, search, and manage large volumes of time-series data
+* Want to scale and reduce costs by using {ilm-init} to automate the management
+  of your indices
+* Index large volumes of time-series data in {es} but rarely delete or update
+  individual documents
 
-* To get information about data streams, use the <<indices-get-data-stream, get data stream API>>.
-* To delete data streams, use the <<indices-delete-data-stream, delete data stream API>>.
-* To manually create a data stream, use the <<indices-create-data-stream, create data stream API>>.
 
 [discrete]
 [[data-streams-toc]]
 == In this section
 
+* <<data-streams-overview>>
 * <<set-up-a-data-stream>>
 * <<use-a-data-stream>>
---
 
+
+include::data-streams-overview.asciidoc[]
 include::set-up-a-data-stream.asciidoc[]
 include::use-a-data-stream.asciidoc[]

+ 8 - 7
docs/reference/data-streams/set-up-a-data-stream.asciidoc

@@ -22,11 +22,11 @@ TIP: Data streams work well with most common log formats. While no schema is
 required to use data streams, we recommend the {ecs-ref}[Elastic Common Schema
 (ECS)].
 
-* Data streams are designed to be append-only. While you can index new documents
-directly to a data stream, you cannot use a data stream to directly update or
-delete individual documents. To update or delete specific documents in a data
-stream, submit a <<docs-delete,delete>> or <<docs-update,update>> API request to
-the backing index containing the document.
+* Data streams are designed to be <<data-streams-append-only,append-only>>.
+While you can index new documents directly to a data stream, you cannot use a
+data stream to directly update or delete individual documents. To update or
+delete specific documents in a data stream, submit a <<docs-delete,delete>> or
+<<docs-update,update>> API request to the backing index containing the document.
 
 
 [discrete]
@@ -57,8 +57,9 @@ The following <<ilm-put-lifecycle,create lifecycle policy API>> request
 configures the `logs_policy` lifecycle policy.
 
 The `logs_policy` policy uses the <<ilm-rollover,`rollover` action>> to create a
-new write index for the data stream when the current one reaches 25GB in size.
-The policy also deletes backing indices 30 days after their rollover.
+new <<data-stream-write-index,write index>> for the data stream when the current
+one reaches 25GB in size. The policy also deletes backing indices 30 days after
+their rollover.
 
 [source,console]
 ----

+ 2 - 1
docs/reference/data-streams/use-a-data-stream.asciidoc

@@ -144,7 +144,8 @@ GET /logs/_search
 === Manually roll over a data stream
 
 A rollover creates a new backing index for a data stream. This new backing index
-becomes the stream's new write index and increments the stream's generation.
+becomes the stream's <<data-stream-write-index,write index>> and increments
+the stream's <<data-streams-generation,generation>>.
 
 In most cases, we recommend using <<index-lifecycle-management,{ilm-init}>> to
 automate rollovers for data streams. This lets you automatically roll over the

File diff suppressed because it is too large
+ 0 - 0
docs/reference/images/data-streams/data-streams-diagram.svg


File diff suppressed because it is too large
+ 0 - 0
docs/reference/images/data-streams/data-streams-index-request.svg


File diff suppressed because it is too large
+ 0 - 0
docs/reference/images/data-streams/data-streams-search-request.svg


+ 1 - 14
docs/reference/indices.asciidoc

@@ -2,7 +2,7 @@
 == Index APIs
 
 Index APIs are used to manage individual indices,
-index settings, data streams, aliases, mappings, and index templates.
+index settings, aliases, mappings, and index templates.
 
 [float]
 [[index-management]]
@@ -30,13 +30,6 @@ index settings, data streams, aliases, mappings, and index templates.
 * <<indices-get-mapping>>
 * <<indices-get-field-mapping>>
 
-[float]
-[[data-stream-management]]
-=== Data stream management:
-* <<indices-create-data-stream>>
-* <<indices-delete-data-stream>>
-* <<indices-get-data-stream>>
-
 [float]
 [[alias-management]]
 === Alias management:
@@ -159,9 +152,3 @@ include::indices/apis/unfreeze.asciidoc[]
 include::indices/aliases.asciidoc[]
 
 include::indices/update-settings.asciidoc[]
-
-include::indices/create-data-stream.asciidoc[]
-
-include::indices/get-data-stream.asciidoc[]
-
-include::indices/delete-data-stream.asciidoc[]

+ 1 - 0
docs/reference/rest-api/index.asciidoc

@@ -47,6 +47,7 @@ endif::[]
 include::{es-repo-dir}/cat.asciidoc[]
 include::{es-repo-dir}/cluster.asciidoc[]
 include::{es-repo-dir}/ccr/apis/ccr-apis.asciidoc[]
+include::{es-repo-dir}/data-streams/data-stream-apis.asciidoc[]
 include::{es-repo-dir}/docs.asciidoc[]
 include::{es-repo-dir}/ingest/apis/enrich/index.asciidoc[]
 include::{es-repo-dir}/graph/explore.asciidoc[]

Some files were not shown because too many files have changed in this diff