
[role="xpack"]
[[indices-downsample-data-stream]]
=== Downsample index API
++++
<titleabbrev>Downsample</titleabbrev>
++++

preview::[]

Aggregates a time series data stream (TSDS) index and stores
pre-computed statistical summaries (`min`, `max`, `sum`, `value_count` and
`avg`) for each metric field grouped by a configured time interval. For example,
a TSDS index that contains metrics sampled every 10 seconds can be downsampled
to an hourly index. All documents within an hour interval are summarized and
stored as a single document in the downsample index.

// tag::downsample-example[]

////
[source,console]
----
PUT /my-time-series-index
{
  "settings": {
    "index": {
      "mode": "time_series",
      "time_series": {
        "start_time": "2022-06-10T00:00:00Z",
        "end_time": "2022-06-30T23:59:59Z"
      },
      "routing_path": [
        "test.namespace"
      ],
      "number_of_replicas": 0,
      "number_of_shards": 2
    }
  },
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "metric": {
        "type": "long",
        "time_series_metric": "gauge"
      },
      "dimension": {
        "type": "keyword",
        "time_series_dimension": true
      }
    }
  }
}

PUT /my-time-series-index/_block/write
----
// TEST
////

[source,console]
----
POST /my-time-series-index/_downsample/my-downsampled-time-series-index
{
  "fixed_interval": "1d"
}
----
// TEST[continued]

////
[source,console]
----
DELETE /my-time-series-index*
DELETE _data_stream/*
DELETE _index_template/*
----
// TEST[continued]
////

// end::downsample-example[]

[[downsample-api-request]]
==== {api-request-title}

`POST /<source-index>/_downsample/<output-downsampled-index>`

[[downsample-api-prereqs]]
==== {api-prereq-title}

* Only indices in a <<tsds,time series data stream>> are supported.

* If the {es} {security-features} are enabled, you must have the `all`
or `manage` <<privileges-list-indices,index privilege>> for the data stream.

* Neither <<field-and-document-access-control,field nor document level security>> can be defined on the source index.

* The source index must be read only (`index.blocks.write: true`), as shown
in the example below.
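
A write block can be added with the add index block API. A minimal sketch,
using the `my-time-series-index` example index from above:

[source,console]
----
PUT /my-time-series-index/_block/write
----
// TEST[skip:illustration of adding a write block]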

[[downsample-api-path-params]]
==== {api-path-parms-title}

`<source-index>`::
(Required, string) Name of the time series index to downsample.

`<output-downsampled-index>`::
+
--
(Required, string) Name of the index to create.

include::{es-repo-dir}/indices/create-index.asciidoc[tag=index-name-reqs]
--

[role="child_attributes"]
[[downsample-api-query-parms]]
==== {api-query-parms-title}

`fixed_interval`::
(Required, <<time-units,time units>>) The interval at which to aggregate the
original time series index. For example, `60m` produces a document for each
60-minute (hourly) interval. This follows standard time formatting syntax as
used elsewhere in {es}.
+
NOTE: Smaller, more granular intervals take up proportionally more space.
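
For example, to downsample the example index into hourly buckets instead of
daily ones, set `fixed_interval` to `60m` (the target index name
`my-hourly-downsampled-index` is illustrative):

[source,console]
----
POST /my-time-series-index/_downsample/my-hourly-downsampled-index
{
  "fixed_interval": "60m"
}
----
// TEST[skip:illustrative interval example]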

[[downsample-api-process]]
==== The downsampling process

The downsampling operation traverses the source TSDS index and performs the
following steps:

. Creates a new document for each value of the `_tsid` field and each
`@timestamp` value, rounded to the `fixed_interval` defined in the downsample
configuration.
. For each new document, copies all <<time-series-dimension,time
series dimensions>> from the source index to the target index. Dimensions in a
TSDS are constant, so this is done only once per bucket.
. For each <<time-series-metric,time series metric>> field, computes aggregations
for all documents in the bucket. Depending on the metric type of each metric
field, a different set of pre-aggregated results is stored:
** `gauge`: The `min`, `max`, `sum`, and `value_count` are stored; `value_count`
is stored as type `aggregate_metric_double`.
** `counter`: The `last_value` is stored.
. For all other fields, the most recent value is copied to the target index.
An illustrative result is sketched after this list.
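
For example, downsampling the example index with `"fixed_interval": "1d"`
might produce documents along these lines. This is an illustrative sketch
(the field values are invented) and the exact `_source` layout can differ:

[source,js]
----
{
  "@timestamp": "2022-06-15T00:00:00.000Z", <1>
  "dimension": "foo", <2>
  "metric": { <3>
    "min": 4.0,
    "max": 12.0,
    "sum": 43.0,
    "value_count": 6
  }
}
----
// NOTCONSOLE
<1> The `@timestamp`, rounded down to the start of the `1d` bucket.
<2> The dimension value, copied unchanged from the source documents.
<3> The pre-aggregated results for the `gauge` metric field.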

[[downsample-api-mappings]]
==== Source and target index field mappings

Fields in the target, downsampled index are created based on fields in the
original source index, as follows:

. All fields mapped with the `time_series_dimension` parameter are created in
the target downsample index with the same mapping as in the source index.
. All fields mapped with the `time_series_metric` parameter are created
in the target downsample index with the same mapping as in the source
index. An exception is that for fields mapped as `time_series_metric: gauge`
the field type is changed to `aggregate_metric_double`, as sketched below.
. All other fields that are neither dimensions nor metrics (that is, label
fields) are created in the target downsample index with the same mapping
that they had in the source index.
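
For example, the `metric` gauge field from the example source index might be
mapped in the target index along these lines (a sketch; the exact generated
mapping, including the `default_metric` choice, may differ):

[source,js]
----
{
  "metric": {
    "type": "aggregate_metric_double",
    "metrics": [ "min", "max", "sum", "value_count" ],
    "default_metric": "max",
    "time_series_metric": "gauge"
  }
}
----
// NOTCONSOLE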

Check the <<downsampling,Downsampling>> documentation for an overview and
examples of running downsampling manually and as part of an ILM policy.