[DOCS] Add how-to guide for time series data (#71195)

James Rodewig 4 years ago
parent
commit
c23f001151

+ 10 - 3
docs/reference/data-streams/set-up-a-data-stream.asciidoc

@@ -29,6 +29,7 @@ To create an index lifecycle policy in {kib}, open the main menu and go to
 
 You can also use the <<ilm-put-lifecycle,create lifecycle policy API>>.
 
+// tag::ilm-policy-api-ex[]
 [source,console]
 ----
 PUT _ilm/policy/my-lifecycle-policy
@@ -38,7 +39,6 @@ PUT _ilm/policy/my-lifecycle-policy
       "hot": {
         "actions": {
           "rollover": {
-            "max_age": "30d",
             "max_primary_shard_size": "50gb"
           }
         }
@@ -58,7 +58,7 @@ PUT _ilm/policy/my-lifecycle-policy
         "min_age": "60d",
         "actions": {
           "searchable_snapshot": {
-            "snapshot_repository": "my-snapshot-repo"
+            "snapshot_repository": "found-snapshots"
           }
         }
       },
@@ -66,7 +66,7 @@ PUT _ilm/policy/my-lifecycle-policy
         "min_age": "90d",
         "actions": {
           "searchable_snapshot": {
-            "snapshot_repository": "my-snapshot-repo"
+            "snapshot_repository": "found-snapshots"
           }
         }
       },
@@ -80,11 +80,13 @@ PUT _ilm/policy/my-lifecycle-policy
   }
 }
 ----
+// end::ilm-policy-api-ex[]
 
 [discrete]
 [[create-component-templates]]
 === Step 2. Create component templates
 
+// tag::ds-create-component-templates[]
 A data stream requires a matching index template. In most cases, you compose
 this index template using one or more component templates. You typically use
 separate component templates for mappings and index settings. This lets you
@@ -156,11 +158,13 @@ PUT _component_template/my-settings
 }
 ----
 // TEST[continued]
+// end::ds-create-component-templates[]
 
 [discrete]
 [[create-index-template]]
 === Step 3. Create an index template
 
+// tag::ds-create-index-template[]
 Use your component templates to create an index template. Specify:
 
 * One or more index patterns that match the data stream's name. We recommend
@@ -196,11 +200,13 @@ PUT _index_template/my-index-template
 }
 ----
 // TEST[continued]
+// end::ds-create-index-template[]
 
 [discrete]
 [[create-data-stream]]
 === Step 4. Create the data stream
 
+// tag::ds-create-data-stream[]
 <<add-documents-to-a-data-stream,Indexing requests>> add documents to a data
 stream. These requests must use an `op_type` of `create`. Documents must include
 a `@timestamp` field.
@@ -224,6 +230,7 @@ POST my-data-stream/_doc
 }
 ----
 // TEST[continued]
+// end::ds-create-data-stream[]
 
 You can also manually create the stream using the
 <<indices-create-data-stream,create data stream API>>. The stream's name must

+ 3 - 1
docs/reference/how-to.asciidoc

@@ -25,4 +25,6 @@ include::how-to/search-speed.asciidoc[]
 
 include::how-to/disk-usage.asciidoc[]
 
-include::how-to/size-your-shards.asciidoc[]
+include::how-to/size-your-shards.asciidoc[]
+
+include::how-to/use-elasticsearch-for-time-series-data.asciidoc[]

+ 213 - 0
docs/reference/how-to/use-elasticsearch-for-time-series-data.asciidoc

@@ -0,0 +1,213 @@
+[[use-elasticsearch-for-time-series-data]]
+== Use {es} for time series data
+
+{es} offers features to help you store, manage, and search time series data,
+such as logs and metrics. Once your data is in {es}, you can analyze and
+visualize it using {kib} and other {stack} features.
+
+To get the most out of your time series data in {es}, follow these steps:
+
+* <<set-up-data-tiers>>
+* <<register-snapshot-repository>>
+* <<create-edit-index-lifecycle-policy>>
+* <<create-ts-component-templates>>
+* <<create-ts-index-template>>
+* <<add-data-to-data-stream>>
+* <<search-visualize-your-data>>
+
+
+[discrete]
+[[set-up-data-tiers]]
+=== Step 1. Set up data tiers
+
+{es}'s <<index-lifecycle-management,{ilm-init}>> feature uses <<data-tiers,data
+tiers>> to automatically move data to nodes with less expensive hardware as it
+ages. This helps improve performance and reduce storage costs.
+
+The hot tier is required. The warm, cold, and frozen tiers are optional. Use
+high-performance nodes in the hot and warm tiers for faster indexing and faster
+searches on your most recent data. Use slower, less expensive nodes in the cold
+and frozen tiers to reduce costs.
+
+The steps for setting up data tiers vary based on your deployment type:
+
+include::{es-repo-dir}/tab-widgets/code.asciidoc[]
+include::{es-repo-dir}/tab-widgets/data-tiers-widget.asciidoc[]
+
+[discrete]
+[[register-snapshot-repository]]
+=== Step 2. Register a snapshot repository
+
+The cold and frozen tiers can use <<searchable-snapshots,{search-snaps}>> to
+reduce local storage costs.
+
+To use {search-snaps}, you must register a supported snapshot repository. The
+steps for registering this repository vary based on your deployment type and
+storage provider:
+
+include::{es-repo-dir}/tab-widgets/snapshot-repo-widget.asciidoc[]
+
+[discrete]
+[[create-edit-index-lifecycle-policy]]
+=== Step 3. Create or edit an index lifecycle policy
+
+A <<data-streams,data stream>> stores your data across multiple backing
+indices. {ilm-init} uses an <<ilm-index-lifecycle,index lifecycle policy>> to
+automatically move these indices through your data tiers.
+
+If you use {fleet} or {agent}, edit one of {es}'s built-in lifecycle policies.
+If you use a custom application, create your own policy. In either case,
+ensure your policy:
+
+* Includes a phase for each data tier you've configured.
+* Calculates the `min_age` threshold for each phase transition from rollover.
+* Uses {search-snaps} in the cold and frozen phases, if desired.
+* Includes a delete phase, if needed.
+
+include::{es-repo-dir}/tab-widgets/ilm-widget.asciidoc[]
+
+[discrete]
+[[create-ts-component-templates]]
+=== Step 4. Create component templates
+
+TIP: If you use {fleet} or {agent}, skip to <<search-visualize-your-data>>.
+{fleet} and {agent} use built-in templates to create data streams for you.
+
+If you use a custom application, you need to set up your own data stream.
+include::{es-repo-dir}/data-streams/set-up-a-data-stream.asciidoc[tag=ds-create-component-templates]
+
+[discrete]
+[[create-ts-index-template]]
+=== Step 5. Create an index template
+
+include::{es-repo-dir}/data-streams/set-up-a-data-stream.asciidoc[tag=ds-create-index-template]
+
+[discrete]
+[[add-data-to-data-stream]]
+=== Step 6. Add data to a data stream
+
+include::{es-repo-dir}/data-streams/set-up-a-data-stream.asciidoc[tag=ds-create-data-stream]
+
+[discrete]
+[[search-visualize-your-data]]
+=== Step 7. Search and visualize your data
+
+To explore and search your data in {kib}, open the main menu and select
+**Discover**. See {kib}'s {kibana-ref}/discover.html[Discover documentation].
+
+Use {kib}'s **Dashboard** feature to visualize your data in a chart, table, map,
+and more. See {kib}'s {kibana-ref}/dashboard.html[Dashboard documentation].
+
+You can also search and aggregate your data using the <<search-search,search
+API>>. Use <<runtime-search-request,runtime fields>> and <<grok-basics,grok
+patterns>> to dynamically extract data from log messages and other unstructured
+content at search time.
+
+[source,console]
+----
+GET my-data-stream/_search
+{
+  "runtime_mappings": {
+    "source.ip": {
+      "type": "ip",
+      "script": """
+        String sourceip=grok('%{IPORHOST:sourceip} .*').extract(doc[ "message" ].value)?.sourceip;
+        if (sourceip != null) emit(sourceip);
+      """
+    }
+  },
+  "query": {
+    "bool": {
+      "filter": [
+        {
+          "range": {
+            "@timestamp": {
+              "gte": "now-1d/d",
+              "lt": "now/d"
+            }
+          }
+        },
+        {
+          "range": {
+            "source.ip": {
+              "gte": "192.0.2.0",
+              "lte": "192.0.2.255"
+            }
+          }
+        }
+      ]
+    }
+  },
+  "fields": [
+    "*"
+  ],
+  "_source": false,
+  "sort": [
+    {
+      "@timestamp": "desc"
+    },
+    {
+      "source.ip": "desc"
+    }
+  ]
+}
+----
+// TEST[setup:my_data_stream]
+// TEST[teardown:data_stream_cleanup]
+
+{es} searches are synchronous by default. Searches across frozen data, long time
+ranges, or large datasets may take longer. Use the <<submit-async-search,async
+search API>> to run searches in the background. For more search options, see
+<<search-your-data>>.
+
+[source,console]
+----
+POST my-data-stream/_async_search
+{
+  "runtime_mappings": {
+    "source.ip": {
+      "type": "ip",
+      "script": """
+        String sourceip=grok('%{IPORHOST:sourceip} .*').extract(doc[ "message" ].value)?.sourceip;
+        if (sourceip != null) emit(sourceip);
+      """
+    }
+  },
+  "query": {
+    "bool": {
+      "filter": [
+        {
+          "range": {
+            "@timestamp": {
+              "gte": "now-2y/d",
+              "lt": "now/d"
+            }
+          }
+        },
+        {
+          "range": {
+            "source.ip": {
+              "gte": "192.0.2.0",
+              "lte": "192.0.2.255"
+            }
+          }
+        }
+      ]
+    }
+  },
+  "fields": [
+    "*"
+  ],
+  "_source": false,
+  "sort": [
+    {
+      "@timestamp": "desc"
+    },
+    {
+      "source.ip": "desc"
+    }
+  ]
+}
+----
+// TEST[setup:my_data_stream]
+// TEST[teardown:data_stream_cleanup]

+ 8 - 30
docs/reference/searchable-snapshots/index.asciidoc

@@ -72,7 +72,8 @@ For more complex or time-consuming searches, you can use <<async-search>> with
 ====
 
 [[searchable-snapshots-repository-types]]
-You can use any of the following repository types with searchable snapshots:
+// tag::searchable-snapshot-repo-types[]
+Use any of the following repository types with searchable snapshots:
 
 * {plugins}/repository-s3.html[AWS S3]
 * {plugins}/repository-gcs.html[Google Cloud Storage]
@@ -83,8 +84,9 @@ You can use any of the following repository types with searchable snapshots:
 You can also use alternative implementations of these repository types, for
 instance
 {plugins}/repository-s3-client.html#repository-s3-compatible-services[Minio],
-as long as they are fully compatible. You can use the <<repo-analysis-api>> API
+as long as they are fully compatible. Use the <<repo-analysis-api>> API
 to analyze your repository's suitability for use with searchable snapshots.
+// end::searchable-snapshot-repo-types[]
 
 [discrete]
 [[how-searchable-snapshots-work]]
@@ -219,31 +221,7 @@ repository storage then you are responsible for its reliability.
 [[searchable-snapshots-frozen-tier-on-cloud]]
 === Configure a frozen tier on {ess}
 
-The frozen data tier is not yet available on {ess-trial}[{ess}]. However,
-you can configure another tier to use <<shared-cache,shared snapshot caches>>.
-This effectively recreates a frozen tier in your deployment. Follow these
-steps:
-
-. Choose an existing tier to use. Typically, you'll use the cold tier, but the
-hot and warm tiers are also supported. You can use this tier as a shared tier,
-or you can dedicate the tier exclusively to shared snapshot caches.
-
-. Log in to the {ess-trial}[{ess} Console].
-
-. Select your deployment from the {ess} home page or the deployments page.
-
-. From your deployment menu, select **Edit deployment**.
-
-. On the **Edit** page, click **Edit elasticsearch.yml** under your selected
-{es} tier.
-
-. In the `elasticsearch.yml` file, add the
-<<searchable-snapshots-shared-cache,`xpack.searchable.snapshot.shared_cache.size`>>
-setting. For example:
-+
-[source,yaml]
-----
-xpack.searchable.snapshot.shared_cache.size: 50GB
-----
-
-. Click **Save** and **Confirm** to apply your configuration changes.
+The frozen data tier is not yet available on {ess-trial}[{ess}]. However, you
+can configure another tier to use <<shared-cache,shared snapshot caches>>. This
+effectively recreates a frozen tier in your deployment. See
+<<set-up-data-tiers,Set up data tiers>>.

+ 40 - 0
docs/reference/tab-widgets/data-tiers-widget.asciidoc

@@ -0,0 +1,40 @@
+++++
+<div class="tabs" data-tab-group="host">
+  <div role="tablist" aria-label="Data tiers configuration">
+    <button role="tab"
+            aria-selected="true"
+            aria-controls="cloud-tab"
+            id="cloud">
+      Elasticsearch Service
+    </button>
+    <button role="tab"
+            aria-selected="false"
+            aria-controls="self-managed-tab"
+            id="self-managed"
+            tabindex="-1">
+      Self-managed
+    </button>
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="cloud-tab"
+       aria-labelledby="cloud">
+++++
+
+include::data-tiers.asciidoc[tag=cloud]
+
+++++
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="self-managed-tab"
+       aria-labelledby="self-managed"
+       hidden="">
+++++
+
+include::data-tiers.asciidoc[tag=self-managed]
+
+++++
+  </div>
+</div>
+++++

+ 90 - 0
docs/reference/tab-widgets/data-tiers.asciidoc

@@ -0,0 +1,90 @@
+// tag::cloud[]
+. Log in to the {ess-trial}[{ess} Console].
+
+. Add or select your deployment from the {ess} home page or the deployments
+page.
+
+. From your deployment menu, select **Edit deployment**.
+
+. To enable a data tier, click **Add capacity**.
+
+experimental:[] **Frozen tier**
+
+The frozen tier is not yet available on {ess}. However, you can follow these
+steps to effectively recreate a frozen tier in your deployment:
+
+. Choose an existing tier to use. You'll typically use the cold tier, but the
+hot and warm tiers are also supported. You can use this tier as a shared tier,
+or use it exclusively as a frozen tier.
+
+. On the **Edit deployment** page, click **Edit elasticsearch.yml** for your
+chosen tier.
+
+. In the `elasticsearch.yml` configuration, set
+<<searchable-snapshots-shared-cache,`xpack.searchable.snapshot.shared_cache.size`>>
+to up to 90% of available disk space. The tier uses this space to
+create a <<shared-cache,shared, fixed-size cache>> for
+<<searchable-snapshots,searchable snapshots>>.
++
+[source,yaml]
+----
+xpack.searchable.snapshot.shared_cache.size: 50GB
+----
+
+. Click **Save** and **Confirm** to apply your changes.
+
+**Enable autoscaling**
+
+{cloud}/ec-autoscaling.html[Autoscaling] automatically adjusts your deployment's
+capacity to meet your storage needs. To enable autoscaling, select **Autoscale
+this deployment** on the **Edit deployment** page. Autoscaling is only available
+for {ess}.
+// end::cloud[]
+
+// tag::self-managed[]
+To assign a node to a data tier, add the respective <<node-roles,node role>> to
+the node's `elasticsearch.yml` file. Changing an existing node's roles requires
+a <<restart-cluster-rolling,rolling restart>>.
+
+[source,yaml]
+----
+# Hot tier
+node.roles: [ data_hot ]
+
+# Warm tier
+node.roles: [ data_warm ]
+
+# Cold tier
+node.roles: [ data_cold ]
+
+# Frozen tier
+node.roles: [ data_frozen ]
+----
+
+experimental:[] For nodes in the frozen tier, set
+<<searchable-snapshots-shared-cache,`xpack.searchable.snapshot.shared_cache.size`>>
+to up to 90% of the node's available disk space. The frozen tier uses this space
+to create a <<shared-cache,shared, fixed-size cache>> for
+<<searchable-snapshots,searchable snapshots>>.
+
+[source,yaml]
+----
+node.roles: [ data_frozen ]
+xpack.searchable.snapshot.shared_cache.size: 50GB
+----
+
+If needed, you can assign a node to more than one tier.
+
+[source,yaml]
+----
+node.roles: [ data_hot, data_warm ]
+----
+
+Assign your nodes any other roles needed for your cluster. For example, a small
+cluster may have nodes with multiple roles.
+
+[source,yaml]
+----
+node.roles: [ master, ingest, ml, data_hot, transform ]
+----
+// end::self-managed[]
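
Review note: both tabs tell the reader to set `xpack.searchable.snapshot.shared_cache.size` to "up to 90%" of disk space but show a fixed `50GB`. The arithmetic can be sketched as below. The function name `shared_cache_size_line` is hypothetical, and the sketch assumes that {es} byte-size units are 1024-based and that whole-number values are preferred, so it rounds down to whole megabytes.

```python
MIB = 1024 ** 2  # assumption: the `mb` unit means 1024 * 1024 bytes


def shared_cache_size_line(disk_bytes: int, percent: int = 90) -> str:
    """Return an elasticsearch.yml line sized at `percent` of `disk_bytes`.

    Hypothetical helper. Integer arithmetic is used throughout so the
    result never exceeds the requested share of the disk.
    """
    if disk_bytes <= 0:
        raise ValueError("disk_bytes must be positive")
    if not 0 < percent <= 90:
        raise ValueError("percent must be between 1 and 90")
    size_mb = disk_bytes * percent // 100 // MIB
    return f"xpack.searchable.snapshot.shared_cache.size: {size_mb}mb"


# Example: a node with a 500 GiB data disk.
line = shared_cache_size_line(500 * 1024 ** 3)
print(line)
```

On a real node, the input could come from `shutil.disk_usage(path).total`, where `path` is the node's data path. Whether "available" in the text means total or free space is not stated in this change, so that choice is left to the reader.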

+ 40 - 0
docs/reference/tab-widgets/ilm-widget.asciidoc

@@ -0,0 +1,40 @@
+++++
+<div class="tabs" data-tab-group="ingest">
+  <div role="tablist" aria-label="Index lifecycle policy configuration">
+    <button role="tab"
+            aria-selected="true"
+            aria-controls="fleet-tab"
+            id="fleet">
+      Fleet or Elastic Agent
+    </button>
+    <button role="tab"
+            aria-selected="false"
+            aria-controls="custom-policy-tab"
+            id="custom"
+            tabindex="-1">
+      Custom application
+    </button>
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="fleet-tab"
+       aria-labelledby="fleet">
+++++
+
+include::ilm.asciidoc[tag=fleet]
+
+++++
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="custom-policy-tab"
+       aria-labelledby="custom"
+       hidden="">
+++++
+
+include::ilm.asciidoc[tag=custom]
+
+++++
+  </div>
+</div>
+++++

+ 75 - 0
docs/reference/tab-widgets/ilm.asciidoc

@@ -0,0 +1,75 @@
+// tag::fleet[]
+{fleet} and {agent} use the following built-in lifecycle policies:
+
+* `logs`
+* `metrics`
+* `synthetics`
+
+You can customize these policies based on your performance, resilience, and
+retention requirements.
+
+To edit a policy in {kib}, open the main menu and go to **Stack Management >
+Index Lifecycle Policies**. Click the policy you'd like to edit.
+
+You can also use the <<ilm-put-lifecycle,update lifecycle policy API>>.
+
+[source,console]
+----
+PUT _ilm/policy/logs
+{
+  "policy": {
+    "phases": {
+      "hot": {
+        "actions": {
+          "rollover": {
+            "max_primary_shard_size": "50gb"
+          }
+        }
+      },
+      "warm": {
+        "min_age": "30d",
+        "actions": {
+          "shrink": {
+            "number_of_shards": 1
+          },
+          "forcemerge": {
+            "max_num_segments": 1
+          }
+        }
+      },
+      "cold": {
+        "min_age": "60d",
+        "actions": {
+          "searchable_snapshot": {
+            "snapshot_repository": "found-snapshots"
+          }
+        }
+      },
+      "frozen": {
+        "min_age": "90d",
+        "actions": {
+          "searchable_snapshot": {
+            "snapshot_repository": "found-snapshots"
+          }
+        }
+      },
+      "delete": {
+        "min_age": "735d",
+        "actions": {
+          "delete": {}
+        }
+      }
+    }
+  }
+}
+----
+// end::fleet[]
+
+// tag::custom[]
+To create a policy in {kib}, open the main menu and go to **Stack Management >
+Index Lifecycle Policies**. Click **Create policy**.
+
+You can also use the <<ilm-put-lifecycle,create lifecycle policy API>>.
+
+include::{es-repo-dir}/data-streams/set-up-a-data-stream.asciidoc[tag=ilm-policy-api-ex]
+// end::custom[]
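
Review note: the example policy in the `fleet` tab sets `min_age` to `30d`, `60d`, `90d`, and `735d` for the warm, cold, frozen, and delete phases. A quick sanity check that these thresholds never decrease from one phase to the next can be sketched as follows. The function names are hypothetical, only the `d`, `h`, `m`, and `s` units are handled, and a missing `min_age` is treated as zero; this is a local check, not a description of how {es} validates policies.

```python
import re

# Seconds per supported unit (assumption: only these four units are needed).
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}
PHASE_ORDER = ["hot", "warm", "cold", "frozen", "delete"]


def min_age_seconds(value: str) -> int:
    """Convert a value such as '30d' to seconds."""
    match = re.fullmatch(r"(\d+)([dhms])", value)
    if match is None:
        raise ValueError(f"unsupported min_age value: {value!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def check_min_age_order(phases: dict) -> list:
    """Return (phase, seconds) pairs in phase order.

    Raises ValueError if a later phase has a smaller min_age than an
    earlier one.
    """
    result = []
    previous = 0
    for name in PHASE_ORDER:
        if name not in phases:
            continue
        age = min_age_seconds(phases[name].get("min_age", "0s"))
        if age < previous:
            raise ValueError(f"{name} phase starts before the previous phase")
        result.append((name, age))
        previous = age
    return result


# min_age values taken from the `PUT _ilm/policy/logs` example.
phases = {
    "hot": {},
    "warm": {"min_age": "30d"},
    "cold": {"min_age": "60d"},
    "frozen": {"min_age": "90d"},
    "delete": {"min_age": "735d"},
}

for phase, seconds in check_min_age_order(phases):
    print(f"{phase}: {seconds // 86400} days after rollover")
```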

+ 40 - 0
docs/reference/tab-widgets/snapshot-repo-widget.asciidoc

@@ -0,0 +1,40 @@
+++++
+<div class="tabs" data-tab-group="host">
+  <div role="tablist" aria-label="Snapshot repository">
+    <button role="tab"
+            aria-selected="true"
+            aria-controls="cloud-tab-repo"
+            id="cloud-repo">
+      Elasticsearch Service
+    </button>
+    <button role="tab"
+            aria-selected="false"
+            aria-controls="self-managed-tab-repo"
+            id="self-managed-repo"
+            tabindex="-1">
+      Self-managed
+    </button>
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="cloud-tab-repo"
+       aria-labelledby="cloud-repo">
+++++
+
+include::snapshot-repo.asciidoc[tag=cloud]
+
+++++
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="self-managed-tab-repo"
+       aria-labelledby="self-managed-repo"
+       hidden="">
+++++
+
+include::snapshot-repo.asciidoc[tag=self-managed]
+
+++++
+  </div>
+</div>
+++++

+ 20 - 0
docs/reference/tab-widgets/snapshot-repo.asciidoc

@@ -0,0 +1,20 @@
+// tag::cloud[]
+When you create a cluster, {ess} automatically registers a default
+{cloud}/ec-snapshot-restore.html[`found-snapshots`] repository. This repository
+supports {search-snaps}.
+
+The `found-snapshots` repository is specific to your cluster. To use another
+cluster's default repository, see
+{cloud}/ec_share_a_repository_across_clusters.html[Share a repository across
+clusters].
+
+You can also use any of the following custom repository types with {search-snaps}:
+
+* {cloud}/ec-gcs-snapshotting.html[Google Cloud Storage (GCS)]
+* {cloud}/ec-azure-snapshotting.html[Azure Blob Storage]
+* {cloud}/ec-aws-custom-repository.html[Amazon Web Services (AWS)]
+// end::cloud[]
+
+// tag::self-managed[]
+include::{es-repo-dir}/searchable-snapshots/index.asciidoc[tag=searchable-snapshot-repo-types]
+// end::self-managed[]