[[set-up-a-data-stream]]
== Set up a data stream

To set up a data stream, follow these steps:

. Check the <<data-stream-prereqs, prerequisites>>.
. <<configure-a-data-stream-ilm-policy>>.
. <<create-a-data-stream-template>>.
. <<create-a-data-stream>>.
. <<get-info-about-a-data-stream>> to verify it exists.

After you set up a data stream, you can <<use-a-data-stream, use the data
stream>> for indexing, searches, and other supported operations.

If you no longer need it, you can <<delete-a-data-stream,delete a data stream>>
and its backing indices.
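As a quick illustration, the following search request targets a hypothetical
`logs` data stream (the name used throughout the examples below) and returns
documents from the last day. This is only a sketch; it assumes an ECS-style
`@timestamp` field in the indexed documents.

[source,console]
----
GET /logs/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-1d/d",
        "lt": "now/d"
      }
    }
  }
}
----
// TEST[skip: illustrative sketch]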
[discrete]
[[data-stream-prereqs]]
=== Prerequisites

* {es} data streams are intended for time-series data only. Each document
indexed to a data stream must contain a shared timestamp field.
+
TIP: Data streams work well with most common log formats. While no schema is
required to use data streams, we recommend the {ecs-ref}[Elastic Common Schema
(ECS)].

* Data streams are designed to be <<data-streams-append-only,append-only>>.
While you can index new documents directly to a data stream, you cannot use a
data stream to directly update or delete individual documents. To update or
delete specific documents in a data stream, submit a <<docs-delete,delete>> or
<<docs-update,update>> API request to the backing index containing the
document, as in the sketch after this list.
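As a minimal sketch of that workflow, the following delete request targets a
backing index and document ID directly rather than the data stream itself. The
backing index name (`.ds-logs-000001`) and document ID are placeholders; in
practice you would first look them up, for example with a search request
against the stream.

[source,console]
----
DELETE /.ds-logs-000001/_doc/qecQmXIBT4jB8tq1nG0j
----
// TEST[skip: illustrative sketch]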
[discrete]
[[configure-a-data-stream-ilm-policy]]
=== Optional: Configure an {ilm-init} lifecycle policy for a data stream

You can use <<index-lifecycle-management,{ilm} ({ilm-init})>> to automatically
manage a data stream's backing indices. For example, you could use {ilm-init}
to:

* Spin up a new write index for the data stream when the current one reaches a
certain size or age.
* Move older backing indices to slower, less expensive hardware.
* Delete stale backing indices to enforce data retention standards.

To use {ilm-init} with a data stream, you must
<<set-up-lifecycle-policy,configure a lifecycle policy>>. This lifecycle policy
should contain the automated actions to take on backing indices and the
triggers for such actions.

TIP: While optional, we recommend using {ilm-init} to scale data streams in
production.
.*Example*
[%collapsible]
====
The following <<ilm-put-lifecycle,create lifecycle policy API>> request
configures the `logs_policy` lifecycle policy.

The `logs_policy` policy uses the <<ilm-rollover,`rollover` action>> to create a
new <<data-stream-write-index,write index>> for the data stream when the current
one reaches 25GB in size. The policy also deletes backing indices 30 days after
their rollover.

[source,console]
----
PUT /_ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "25GB"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----
====
[discrete]
[[create-a-data-stream-template]]
=== Create a composable template for a data stream

Each data stream requires a <<indices-templates,composable template>>. The data
stream uses this template to create its backing indices.

Composable templates for data streams must contain:

* A name or wildcard (`*`) pattern for the data stream in the `index_patterns`
property.

* A `data_stream` definition containing the `timestamp_field` property.
This timestamp field must be included in every document indexed to the data
stream.

* A <<date,`date`>> or <<date_nanos,`date_nanos`>> field mapping for the
timestamp field specified in the `timestamp_field` property.

* If you intend to use {ilm-init}, you must specify the
<<configure-a-data-stream-ilm-policy,lifecycle policy>> in the
`index.lifecycle.name` setting.

You can also specify other mappings and settings you'd like to apply to the
stream's backing indices.

TIP: We recommend you carefully consider which mappings and settings to include
in this template before creating a data stream. Later changes to the mappings or
settings of a stream's backing indices may require reindexing. See
<<data-streams-change-mappings-and-settings>>.
.*Example*
[%collapsible]
====
The following <<indices-templates,put composable template API>> request
configures the `logs_data_stream` template.

[source,console]
----
PUT /_index_template/logs_data_stream
{
  "index_patterns": [ "logs*" ],
  "data_stream": {
    "timestamp_field": "@timestamp"
  },
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        }
      }
    },
    "settings": {
      "index.lifecycle.name": "logs_policy"
    }
  }
}
----
// TEST[continued]
====
[discrete]
[[create-a-data-stream]]
=== Create a data stream

With a composable template, you can create a data stream using one of two
methods:

* Submit an <<add-documents-to-a-data-stream,indexing request>> to a target
matching the name or wildcard pattern defined in the template's `index_patterns`
property.
+
--
If the indexing request's target doesn't exist, {es} creates the data stream and
uses the target name as the name for the stream.

NOTE: Data streams support only specific types of indexing requests. See
<<add-documents-to-a-data-stream>>.
[[index-documents-to-create-a-data-stream]]
.*Example: Index documents to create a data stream*
[%collapsible]
====
The following <<docs-index_,index API>> request targets `logs`, which matches
the wildcard pattern for the `logs_data_stream` template. Because no existing
index or data stream uses this name, this request creates the `logs` data stream
and indexes the document to it.

[source,console]
----
POST /logs/_doc/
{
  "@timestamp": "2020-12-06T11:04:05.000Z",
  "user": {
    "id": "vlb44hny"
  },
  "message": "Login attempt failed"
}
----
// TEST[continued]
The API returns the following response. Note the `_index` property contains
`.ds-logs-000001`, indicating the document was indexed to the write index of the
new `logs` data stream.

[source,console-result]
----
{
  "_index": ".ds-logs-000001",
  "_id": "qecQmXIBT4jB8tq1nG0j",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
----
// TESTRESPONSE[s/"_id": "qecQmXIBT4jB8tq1nG0j"/"_id": $body._id/]
====
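As the note above indicates, only certain request types can add documents to a
data stream; the bulk API's `create` action is one of them. The following
sketch indexes two documents into the `logs` stream this way. The field values
are placeholders.

[source,console]
----
PUT /logs/_bulk
{ "create": {} }
{ "@timestamp": "2020-12-06T11:04:05.000Z", "user": { "id": "vlb44hny" }, "message": "Login attempt failed" }
{ "create": {} }
{ "@timestamp": "2020-12-06T11:06:07.000Z", "user": { "id": "vlb44hny" }, "message": "Login successful" }
----
// TEST[skip: illustrative sketch]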
--

* Use the <<indices-create-data-stream,create data stream API>> to manually
create a data stream. The name of the data stream must match the
name or wildcard pattern defined in the template's `index_patterns` property.
+
--
.*Example: Manually create a data stream*
[%collapsible]
====
The following <<indices-create-data-stream,create data stream API>> request
targets `logs_alt`, which matches the wildcard pattern for the
`logs_data_stream` template. Because no existing index or data stream uses this
name, this request creates the `logs_alt` data stream.

[source,console]
----
PUT /_data_stream/logs_alt
----
// TEST[continued]
====
--
////
[source,console]
----
DELETE /_data_stream/logs
DELETE /_data_stream/logs_alt
DELETE /_index_template/logs_data_stream
DELETE /_ilm/policy/logs_policy
----
// TEST[continued]
////
[discrete]
[[get-info-about-a-data-stream]]
=== Get information about a data stream

You can use the <<indices-get-data-stream,get data stream API>> to get
information about one or more data streams, including:

* The timestamp field
* The current backing indices, which are returned as an array. The last item in
the array contains information about the stream's current write index.
* The current generation

This is also a handy way to verify that a recently created data stream exists.
.*Example*
[%collapsible]
====
The following get data stream API request retrieves information about any data
streams starting with `logs`.
[source,console]
----
GET /_data_stream/logs*
----
// TEST[skip: shard failures]

The API returns the following response, which includes information about the
`logs` data stream. Note the `indices` property contains an array of the
stream's current backing indices. The last item in this array contains
information for the `logs` stream's write index, `.ds-logs-000002`.

[source,console-result]
----
[
  {
    "name": "logs",
    "timestamp_field": "@timestamp",
    "indices": [
      {
        "index_name": ".ds-logs-000001",
        "index_uuid": "DXAE-xcCQTKF93bMm9iawA"
      },
      {
        "index_name": ".ds-logs-000002",
        "index_uuid": "Wzxq0VhsQKyPxHhaK3WYAg"
      }
    ],
    "generation": 2
  }
]
----
// TESTRESPONSE[skip:unable to assert responses with top level array]
====
[discrete]
[[delete-a-data-stream]]
=== Delete a data stream

You can use the <<indices-delete-data-stream,delete data stream API>> to delete
a data stream and its backing indices.

.*Example*
[%collapsible]
====
The following delete data stream API request deletes the `logs` data stream. This
request also deletes the stream's backing indices and any data they contain.

////
[source,console]
----
PUT /_index_template/logs_data_stream
{
  "index_patterns": [ "logs*" ],
  "data_stream": {
    "timestamp_field": "@timestamp"
  },
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        }
      }
    }
  }
}

PUT /_data_stream/logs
----
////

[source,console]
----
DELETE /_data_stream/logs
----
// TEST[continued]
====
////
[source,console]
----
DELETE /_index_template/logs_data_stream
----
// TEST[continued]
////