[role="xpack"]
[[set-up-a-data-stream]]
== Set up a data stream

To set up a data stream, follow these steps:

. Check the <<data-stream-prereqs, prerequisites>>.
. <<configure-a-data-stream-ilm-policy>>.
. <<create-a-data-stream-template>>.
. <<create-a-data-stream>>.
. <<get-info-about-a-data-stream>> to verify it exists.
. <<secure-a-data-stream>>.

After you set up a data stream, you can <<use-a-data-stream, use the data
stream>> for indexing, searches, and other supported operations.

If you no longer need it, you can <<delete-a-data-stream,delete a data stream>>
and its backing indices.

[discrete]
[[data-stream-prereqs]]
=== Prerequisites

* {es} data streams are intended for time-series data only. Each document
indexed to a data stream must contain the `@timestamp` field. This field must be
mapped as a <<date,`date`>> or <<date_nanos,`date_nanos`>> field data type.

* Data streams are best suited for time-based,
<<data-streams-append-only,append-only>> use cases. If you frequently need to
update or delete existing documents, we recommend using an index alias and an
index template instead.

[discrete]
[[configure-a-data-stream-ilm-policy]]
=== Optional: Configure an {ilm-init} lifecycle policy for a data stream

You can use <<index-lifecycle-management,{ilm} ({ilm-init})>> to automatically
manage a data stream's backing indices. For example, you could use {ilm-init}
to:

* Spin up a new write index for the data stream when the current one reaches a
  certain size or age.
* Move older backing indices to slower, less expensive hardware.
* Delete stale backing indices to enforce data retention standards.

To use {ilm-init} with a data stream, you must
<<set-up-lifecycle-policy,configure a lifecycle policy>>. This lifecycle policy
should contain the automated actions to take on backing indices and the
triggers for such actions.

TIP: While optional, we recommend using {ilm-init} to scale data streams in
production.

.*Example*
[%collapsible]
====
The following <<ilm-put-lifecycle,create lifecycle policy API>> request
configures the `logs_policy` lifecycle policy.

The `logs_policy` policy uses the <<ilm-rollover,`rollover` action>> to create a
new <<data-stream-write-index,write index>> for the data stream when the current
one reaches 25GB in size. The policy also deletes backing indices 30 days after
their rollover.

[source,console]
----
PUT /_ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "25GB"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----
====

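
The use case of moving older backing indices to less expensive hardware is not
covered by `logs_policy`. As a rough sketch only, a policy could add a `warm`
phase that uses the `allocate` action. The policy name `logs_policy_tiered`, the
`7d` thresholds, and the custom node attribute `data: warm` are all assumptions
made for illustration. Adjust them to match how your nodes are tagged.

[source,console]
----
PUT /_ilm/policy/logs_policy_tiered
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "25GB",
            "max_age": "7d"          <1>
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "require": {
              "data": "warm"         <2>
            }
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----
// TEST[skip:illustrative example]

<1> Illustrative age trigger. Rollover occurs when either the size or the age
condition is met.
<2> Hypothetical node attribute. This only has an effect if some nodes are
configured with a matching `node.attr.data: warm` setting.
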
[discrete]
[[create-a-data-stream-template]]
=== Create an index template for a data stream

Each data stream requires an <<indices-templates,index template>>. The data
stream uses this template to create its backing indices.

An index template for a data stream must contain:

* A name or wildcard (`*`) pattern for the data stream in the `index_patterns`
property.
+
You can use the resolve index API to check if the name or pattern
matches any existing indices, index aliases, or data streams. If so, you should
consider using another name or pattern.
+
.*Example*
[%collapsible]
====
The following resolve index API request checks for any existing indices, index
aliases, or data streams that start with `logs`. If none exist, the `logs*`
wildcard pattern can be used to create a new data stream.

[source,console]
----
GET /_resolve/index/logs*
----
// TEST[continued]

The API returns the following response, indicating no existing targets match
this pattern.

[source,console-result]
----
{
  "indices" : [ ],
  "aliases" : [ ],
  "data_streams" : [ ]
}
----
====

* A `data_stream` definition containing `@timestamp` in the `timestamp_field`
property. The `@timestamp` field must be included in every document indexed to
the data stream.

The template can also contain:

* An optional field mapping for the `@timestamp` field. Both the <<date,`date`>> and
<<date_nanos,`date_nanos`>> field data types are supported. If no mapping is specified,
a <<date,`date`>> field data type with default options is used.
+
This mapping can include other <<mapping-params,mapping parameters>>, such as
<<mapping-date-format,`format`>>.
+
IMPORTANT: Carefully consider the `@timestamp` field's mapping, including
its <<mapping-params,mapping parameters>>.
Once the stream is created, you can only update the `@timestamp` field's mapping
by reindexing the data stream. See
<<data-streams-use-reindex-to-change-mappings-settings>>.

* If you intend to use {ilm-init}, the
  <<configure-a-data-stream-ilm-policy,lifecycle policy>> in the
  `index.lifecycle.name` setting.

You can also specify other mappings and settings you'd like to apply to the
stream's backing indices.

TIP: We recommend you carefully consider which mappings and settings to include
in this template before creating a data stream. Later changes to the mappings or
settings of a stream's backing indices may require reindexing. See
<<data-streams-change-mappings-and-settings>>.

.*Example*
[%collapsible]
====
The following <<indices-templates,put index template API>> request
configures the `logs_data_stream` template.

[source,console]
----
PUT /_index_template/logs_data_stream
{
  "index_patterns": [ "logs*" ],
  "data_stream": {
    "timestamp_field": "@timestamp"
  },
  "template": {
    "settings": {
      "index.lifecycle.name": "logs_policy"
    }
  }
}
----
// TEST[continued]
====

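
The `logs_data_stream` example relies on the default `date` mapping for
`@timestamp`. As a hedged sketch of the optional field mapping, the following
alternative version of the same request maps `@timestamp` as `date_nanos`.
Choosing `date_nanos` here is an assumption for illustration, not a
recommendation. Use whichever supported type fits your data.

[source,console]
----
PUT /_index_template/logs_data_stream
{
  "index_patterns": [ "logs*" ],
  "data_stream": {
    "timestamp_field": "@timestamp"
  },
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date_nanos"       <1>
        }
      }
    },
    "settings": {
      "index.lifecycle.name": "logs_policy"
    }
  }
}
----
// TEST[skip:illustrative example]

<1> Illustrative choice. Omit the `mappings` object to fall back to a `date`
field with default options.

Because the `@timestamp` mapping can only be changed later by reindexing, a
mapping like this should be in place before the data stream is created.
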
NOTE: You cannot delete an index template that's in use by a data stream.
This would prevent the data stream from creating new backing indices.

[discrete]
[[create-a-data-stream]]
=== Create a data stream

With an index template, you can create a data stream using one of two
methods:

* Submit an <<add-documents-to-a-data-stream,indexing request>> to a target
matching the name or wildcard pattern defined in the template's `index_patterns`
property.
+
--
If the indexing request's target doesn't exist, {es} creates the data stream and
uses the target name as the name for the stream.

NOTE: Data streams support only specific types of indexing requests. See
<<add-documents-to-a-data-stream>>.

[[index-documents-to-create-a-data-stream]]
.*Example: Index documents to create a data stream*
[%collapsible]
====
The following <<docs-index_,index API>> request targets `logs`, which matches
the wildcard pattern for the `logs_data_stream` template. Because no existing
index or data stream uses this name, this request creates the `logs` data stream
and indexes the document to it.

[source,console]
----
POST /logs/_doc/
{
  "@timestamp": "2020-12-06T11:04:05.000Z",
  "user": {
    "id": "vlb44hny"
  },
  "message": "Login attempt failed"
}
----
// TEST[continued]

The API returns the following response. Note the `_index` property contains
`.ds-logs-000001`, indicating the document was indexed to the write index of the
new `logs` data stream.

[source,console-result]
----
{
  "_index": ".ds-logs-000001",
  "_id": "qecQmXIBT4jB8tq1nG0j",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
----
// TESTRESPONSE[s/"_id": "qecQmXIBT4jB8tq1nG0j"/"_id": $body._id/]
====
--

* Use the <<indices-create-data-stream,create data stream API>> to manually
create a data stream. The name of the data stream must match the
name or wildcard pattern defined in the template's `index_patterns` property.
+
--
.*Example: Manually create a data stream*
[%collapsible]
====
The following <<indices-create-data-stream,create data stream API>> request
targets `logs_alt`, which matches the wildcard pattern for the
`logs_data_stream` template. Because no existing index or data stream uses this
name, this request creates the `logs_alt` data stream.

[source,console]
----
PUT /_data_stream/logs_alt
----
// TEST[continued]
====
--

[discrete]
[[get-info-about-a-data-stream]]
=== Get information about a data stream

You can use the <<indices-get-data-stream,get data stream API>> to get
information about one or more data streams, including:

* The timestamp field
* The current backing indices, returned as an array. The last item in
  the array contains information about the stream's current write index.
* The current generation
* The data stream's health status
* The index template used to create the stream's backing indices
* The current {ilm-init} lifecycle policy in the stream's matching index
template

This is also a handy way to verify that a recently created data stream exists.

.*Example*
[%collapsible]
====
The following get data stream API request retrieves information about the
`logs` data stream.

////
[source,console]
----
POST /logs/_rollover/
----
// TEST[continued]
////

[source,console]
----
GET /_data_stream/logs
----
// TEST[continued]

The API returns the following response. Note the `indices` property contains an
array of the stream's current backing indices. The last item in this array
contains information about the stream's write index, `.ds-logs-000002`.

[source,console-result]
----
{
  "data_streams": [
    {
      "name": "logs",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-logs-000001",
          "index_uuid": "krR78LfvTOe6gr5dj2_1xQ"
        },
        {
          "index_name": ".ds-logs-000002",        <1>
          "index_uuid": "C6LWyNJHQWmA08aQGvqRkA"
        }
      ],
      "generation": 2,
      "status": "GREEN",
      "template": "logs_data_stream",
      "ilm_policy": "logs_policy"
    }
  ]
}
----
// TESTRESPONSE[s/"index_uuid": "krR78LfvTOe6gr5dj2_1xQ"/"index_uuid": $body.data_streams.0.indices.0.index_uuid/]
// TESTRESPONSE[s/"index_uuid": "C6LWyNJHQWmA08aQGvqRkA"/"index_uuid": $body.data_streams.0.indices.1.index_uuid/]
// TESTRESPONSE[s/"status": "GREEN"/"status": "YELLOW"/]

<1> Last item in the `indices` array for the `logs` data stream. This item
contains information about the stream's current write index, `.ds-logs-000002`.
====

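
The example above retrieves a single stream. As a brief sketch of retrieving
several at once, the requests below assume your version of the get data stream
API accepts a wildcard (`*`) expression and returns all data streams when no
name is given. Check the API reference for your release before relying on
either form.

[source,console]
----
GET /_data_stream/logs*

GET /_data_stream
----
// TEST[skip:illustrative example]

With the streams created earlier on this page, the first request would be
expected to match both `logs` and `logs_alt`.
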
[discrete]
[[secure-a-data-stream]]
=== Secure a data stream

You can use {es} {security-features} to control access to a data stream and its
data. See <<data-stream-privileges>>.

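
As a minimal sketch, assuming the {security-features} are enabled, a create or
update roles API request like the following could define a role with read-only
access to data streams matching `logs*`. The role name `logs_reader` and the
choice of the `read` privilege are hypothetical, picked only for illustration.

[source,console]
----
POST /_security/role/logs_reader
{
  "indices": [
    {
      "names": [ "logs*" ],          <1>
      "privileges": [ "read" ]       <2>
    }
  ]
}
----
// TEST[skip:illustrative example]

<1> Matches the `logs` and `logs_alt` data streams used on this page.
<2> Illustrative privilege. See <<data-stream-privileges>> for how privileges
granted on a data stream apply to its backing indices.
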
[discrete]
[[delete-a-data-stream]]
=== Delete a data stream

You can use the <<indices-delete-data-stream,delete data stream API>> to delete
a data stream and its backing indices.

.*Example*
[%collapsible]
====
The following delete data stream API request deletes the `logs` data stream. This
request also deletes the stream's backing indices and any data they contain.

[source,console]
----
DELETE /_data_stream/logs
----
// TEST[continued]
====

////
[source,console]
----
DELETE /_data_stream/*
DELETE /_index_template/*
DELETE /_ilm/policy/logs_policy
----
// TEST[continued]
////