1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162 |
- [role="xpack"]
- [[data-stream-lifecycle]]
- == Data stream lifecycle
- A data stream lifecycle is the built-in mechanism data streams use to manage their lifecycle. It enables you to easily
- automate the management of your data streams according to your retention requirements. For example, you could configure
- the lifecycle to:
- * Ensure that data indexed in the data stream will be kept at least for the retention time you defined.
- * Ensure that data older than the retention period will be deleted automatically by {es} at a later time.
- To achieve that, it supports:
- * Automatic <<index-rollover,rollover>>, which chunks your incoming data in smaller pieces to facilitate better performance
- and backwards incompatible mapping changes.
- * Configurable retention, which allows you to configure the time period for which your data is guaranteed to be stored.
- {es} is allowed at a later time to delete data older than this time period.
- [discrete]
- [[data-streams-lifecycle-how-it-works]]
- === How does it work?
- In intervals configured by <<data-streams-lifecycle-poll-interval,`data_streams.lifecycle.poll_interval`>>, {es} goes over
- each data stream and performs the following steps:
- 1. Checks if the data stream has a data stream lifecycle configured, skipping any indices not part of a managed data stream.
- 2. Rolls over the write index of the data stream, if it fulfills the conditions defined by
- <<cluster-lifecycle-default-rollover,`cluster.lifecycle.default.rollover`>>.
- 3. Applies retention to the remaining backing indices. This means deleting the backing indices whose
- `generation_time` is longer than the configured retention period. The `generation_time` is only applicable to rolled over backing
- indices and it is either the time since the backing index got rolled over, or the time optionally configured in the
- <<index-data-stream-lifecycle-origination-date,`index.lifecycle.origination_date`>> setting.
- IMPORTANT: We use the `generation_time` instead of the creation time because this ensures that all data in the backing
- index have passed the retention period. As a result, the retention period is not the exact time data gets deleted, but
- the minimum time data will be stored.
- NOTE: The steps `2` and `3` apply only to backing indices that are not already managed by {ilm-init}, meaning that these indices either do
- not have an {ilm-init} policy defined, or if they do, they have <<index-lifecycle-prefer-ilm,`index.lifecycle.prefer_ilm`>>
- set to `false`.
- [discrete]
- [[data-stream-lifecycle-configuration]]
- === Configuring data stream lifecycle
- Since the lifecycle is configured on the data stream level, the process to configure a lifecycle on a new data stream and
- on an existing one differ.
- In the following sections, we will go through the following tutorials:
- * To create a new data stream with a lifecycle, you need to add the data stream lifecycle as part of the index template
- that matches the name of your data stream (see <<tutorial-manage-new-data-stream>>). When a write operation
- with the name of your data stream reaches {es} then the data stream will be created with the respective data stream lifecycle.
- * To update the lifecycle of an existing data stream you need to use the <<data-stream-lifecycle-api, data stream lifecycle APIs>>
- to edit the lifecycle on the data stream itself (see <<tutorial-manage-existing-data-stream>>).
- NOTE: Updating the data stream lifecycle of an existing data stream is different from updating the settings or the mapping,
- because it is applied on the data stream level and not on the individual backing indices.
- include::tutorial-manage-new-data-stream.asciidoc[]
- include::tutorial-manage-existing-data-stream.asciidoc[]
|