  1. ifdef::permanently-unreleased-branch[]
  2. [role="xpack"]
  3. [testenv="basic"]
  4. [[rollup-search]]
  5. === Legacy rollup search
  6. ++++
  7. <titleabbrev>Legacy rollup search</titleabbrev>
  8. ++++
  9. include::put-job.asciidoc[tag=legacy-rollup-admon]
  10. Searches legacy rollup data using <<query-dsl,query DSL>>.
  11. endif::[]
  12. ifndef::permanently-unreleased-branch[]
  13. [role="xpack"]
  14. [testenv="basic"]
  15. [[rollup-search]]
  16. === Rollup search
  17. ++++
  18. <titleabbrev>Rollup search</titleabbrev>
  19. ++++
  20. Enables searching rolled-up data using the standard query DSL.
  21. experimental[]
  22. endif::[]
  23. [[rollup-search-request]]
  24. ==== {api-request-title}
  25. `GET <target>/_rollup_search`
  26. [[rollup-search-desc]]
  27. ==== {api-description-title}
  28. The rollup search endpoint is needed because, internally, rolled-up documents
  29. utilize a different document structure than the original data. The rollup search
  30. endpoint rewrites standard query DSL into a format that matches the rollup
  31. documents, then takes the response and rewrites it back to what a client would
  32. expect given the original query.
  33. [[rollup-search-path-params]]
  34. ==== {api-path-parms-title}
  35. `<target>`::
  36. +
  37. --
  38. (Required, string)
  39. Comma-separated list of data streams and indices used to limit
  40. the request. Wildcard expressions (`*`) are supported.
  41. This target can include both rollup and non-rollup indices.
  42. Rules for the `<target>` parameter:
  43. - At least one data stream, index, or wildcard expression must be specified.
  44. This target can include a rollup or non-rollup index. For data streams, the
  45. stream's backing indices can only serve as non-rollup indices. Omitting the
  46. `<target>` parameter or using `_all` is not permitted.
  47. - Multiple non-rollup indices may be specified.
  48. - Only one rollup index may be specified. If more than one are supplied, an
  49. exception occurs.
  50. - Wildcard expressions may be used, but, if they match more than one rollup index, an
  51. exception occurs. However, you can use an expression to match multiple non-rollup
  52. indices or data streams.
  53. --
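The `<target>` rules above can be sketched as a small client-side check. This is only an illustration, not how the endpoint is implemented: the function name and the `all_indices` and `rollup_indices` inputs are hypothetical stand-ins for cluster state, and `fnmatchcase` is used as an approximation of wildcard matching (it also treats `?` and `[...]` as special, which the text above does not mention).

```python
from fnmatch import fnmatchcase


def resolve_rollup_search_target(target, all_indices, rollup_indices):
    """Resolve a comma-separated <target> and enforce the rules listed above.

    all_indices and rollup_indices are hypothetical inputs standing in for
    cluster state; they are not part of the rollup search API.
    """
    expressions = [part.strip() for part in target.split(",") if part.strip()]
    # Omitting the target or using _all is not permitted.
    if not expressions or "_all" in expressions:
        raise ValueError("a target other than _all must be specified")

    # Expand each expression against the known index names, keeping order.
    matched = []
    for expression in expressions:
        for name in all_indices:
            if fnmatchcase(name, expression) and name not in matched:
                matched.append(name)

    rollup = [name for name in matched if name in rollup_indices]
    live = [name for name in matched if name not in rollup_indices]
    # Any number of non-rollup indices is fine, but only one rollup index.
    if len(rollup) > 1:
        raise ValueError("only one rollup index may be specified")
    return live, rollup
```

For example, with hypothetical indices `sensor-1`, `sensor-2`, `sensor_rollup`, and `other_rollup`, the target `sensor-*,sensor_rollup` resolves to two non-rollup indices and one rollup index, while `*_rollup` is rejected because it matches two rollup indices.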
  54. [[rollup-search-request-body]]
  55. ==== {api-request-body-title}
  56. The request body supports a subset of features from the regular Search API. It
  57. supports:
  58. - `query` param for specifying a DSL query, subject to some limitations
  59. (see <<rollup-search-limitations>> and <<rollup-agg-limitations>>)
  60. - `aggregations` param for specifying aggregations
  61. Functionality that is not available:
  62. - `size`: Because rollups work on pre-aggregated data, no search hits can be
  63. returned, so `size` must be set to zero or omitted entirely.
  64. - `highlighter`, `suggestors`, `post_filter`, `profile`, `explain`: These are
  65. similarly disallowed.
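As a rough illustration of these restrictions, a client could check a request body before sending it. The function below is a hypothetical helper, not part of any Elasticsearch client, and the disallowed key names are copied as written in the list above.

```python
# Key names copied from the list above; treat them as assumptions.
DISALLOWED_KEYS = {"highlighter", "suggestors", "post_filter", "profile", "explain"}


def check_rollup_search_body(body):
    """Return a list of problems found in a rollup search request body."""
    problems = []
    # size must be zero or omitted because no search hits can be returned.
    if body.get("size", 0) != 0:
        problems.append("size must be 0 or omitted")
    for key in sorted(body):
        if key in DISALLOWED_KEYS:
            problems.append(f"{key} is not available in rollup search")
    return problems
```

A body such as `{"size": 0, "aggregations": {...}}` yields an empty list, while `{"size": 10, "explain": True}` reports both the `size` and the `explain` problem.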
  66. [[rollup-search-example]]
  67. ==== {api-examples-title}
  68. ===== Historical-only search example
  69. Imagine we have an index named `sensor-1` full of raw data, and we have created
  70. a {rollup-job} with the following configuration:
  71. [source,console]
  72. --------------------------------------------------
  73. PUT _rollup/job/sensor
  74. {
  75. "index_pattern": "sensor-*",
  76. "rollup_index": "sensor_rollup",
  77. "cron": "*/30 * * * * ?",
  78. "page_size": 1000,
  79. "groups": {
  80. "date_histogram": {
  81. "field": "timestamp",
  82. "fixed_interval": "1h",
  83. "delay": "7d"
  84. },
  85. "terms": {
  86. "fields": [ "node" ]
  87. }
  88. },
  89. "metrics": [
  90. {
  91. "field": "temperature",
  92. "metrics": [ "min", "max", "sum" ]
  93. },
  94. {
  95. "field": "voltage",
  96. "metrics": [ "avg" ]
  97. }
  98. ]
  99. }
  100. --------------------------------------------------
  101. // TEST[setup:sensor_index]
  102. This rolls up the `sensor-*` pattern and stores the results in `sensor_rollup`.
  103. To search this rolled-up data, we need to use the `_rollup_search` endpoint.
  104. However, the request body can still use regular query DSL to search the
  105. rolled-up data:
  106. [source,console]
  107. --------------------------------------------------
  108. GET /sensor_rollup/_rollup_search
  109. {
  110. "size": 0,
  111. "aggregations": {
  112. "max_temperature": {
  113. "max": {
  114. "field": "temperature"
  115. }
  116. }
  117. }
  118. }
  119. --------------------------------------------------
  120. // TEST[setup:sensor_prefab_data]
  121. // TEST[s/_rollup_search/_rollup_search?filter_path=took,timed_out,terminated_early,_shards,hits,aggregations/]
  122. The query is targeting the `sensor_rollup` data, since this contains the rollup
  123. data as configured in the job. A `max` aggregation has been used on the
  124. `temperature` field, yielding the following response:
  125. [source,console-result]
  126. ----
  127. {
  128. "took" : 102,
  129. "timed_out" : false,
  130. "terminated_early" : false,
  131. "_shards" : ... ,
  132. "hits" : {
  133. "total" : {
  134. "value": 0,
  135. "relation": "eq"
  136. },
  137. "max_score" : 0.0,
  138. "hits" : [ ]
  139. },
  140. "aggregations" : {
  141. "max_temperature" : {
  142. "value" : 202.0
  143. }
  144. }
  145. }
  146. ----
  147. // TESTRESPONSE[s/"took" : 102/"took" : $body.$_path/]
  148. // TESTRESPONSE[s/"_shards" : \.\.\. /"_shards" : $body.$_path/]
  149. The response is exactly as you'd expect from a regular query + aggregation; it
  150. provides some metadata about the request (`took`, `_shards`, etc.), the search
  151. hits (which are always empty for rollup searches), and the aggregation response.
  152. Rollup searches are limited to functionality that was configured in the
  153. {rollup-job}. For example, we are not able to calculate the average temperature
  154. because `avg` was not one of the configured metrics for the `temperature` field.
  155. If we try to execute that search:
  156. [source,console]
  157. --------------------------------------------------
  158. GET sensor_rollup/_rollup_search
  159. {
  160. "size": 0,
  161. "aggregations": {
  162. "avg_temperature": {
  163. "avg": {
  164. "field": "temperature"
  165. }
  166. }
  167. }
  168. }
  169. --------------------------------------------------
  170. // TEST[continued]
  171. // TEST[catch:/illegal_argument_exception/]
  172. [source,console-result]
  173. ----
  174. {
  175. "error": {
  176. "root_cause": [
  177. {
  178. "type": "illegal_argument_exception",
  179. "reason": "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
  180. "stack_trace": ...
  181. }
  182. ],
  183. "type": "illegal_argument_exception",
  184. "reason": "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
  185. "stack_trace": ...
  186. },
  187. "status": 400
  188. }
  189. ----
  190. // TESTRESPONSE[s/"stack_trace": \.\.\./"stack_trace": $body.$_path/]
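The reason for this error can be approximated with a simplified capability check. The sketch below only handles top-level, single-field metric aggregations and mirrors the `metrics` section of the `sensor` job above; the real matching logic is more involved, and `JOB_METRICS` and the function name are hypothetical.

```python
# Metrics configured per field in the example job above.
JOB_METRICS = {
    "temperature": {"min", "max", "sum"},
    "voltage": {"avg"},
}


def unsupported_metric_aggs(aggregations, job_metrics=JOB_METRICS):
    """Return names of top-level metric aggregations the job cannot answer."""
    unsupported = []
    for name, definition in aggregations.items():
        for agg_type, params in definition.items():
            field = params.get("field")
            # The aggregation type must be among the metrics rolled up for the field.
            if agg_type not in job_metrics.get(field, set()):
                unsupported.append(name)
    return unsupported
```

With this check, the `max_temperature` aggregation from the first example passes, while `avg_temperature` is reported because `avg` was not configured for the `temperature` field.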
  191. ===== Searching both historical rollup and non-rollup data
  192. The rollup search API has the capability to search across both "live"
  193. non-rollup data and the aggregated rollup data. This is done by simply adding
  194. the live indices to the URI:
  195. [source,console]
  196. --------------------------------------------------
  197. GET sensor-1,sensor_rollup/_rollup_search <1>
  198. {
  199. "size": 0,
  200. "aggregations": {
  201. "max_temperature": {
  202. "max": {
  203. "field": "temperature"
  204. }
  205. }
  206. }
  207. }
  208. --------------------------------------------------
  209. // TEST[continued]
  210. // TEST[s/_rollup_search/_rollup_search?filter_path=took,timed_out,terminated_early,_shards,hits,aggregations/]
  211. <1> Note the URI now searches `sensor-1` and `sensor_rollup` at the same time.
  212. When the search is executed, the rollup search endpoint does two things:
  213. 1. The original request is sent to the non-rollup index unaltered.
  214. 2. A rewritten version of the original request is sent to the rollup index.
  215. When the two responses are received, the endpoint rewrites the rollup response
  216. and merges the two together. During the merging process, if there is any overlap
  217. in buckets between the two responses, the buckets from the non-rollup index are
  218. used.
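The overlap rule can be illustrated with a minimal sketch. It assumes each response has already been reduced to a flat list of buckets with a `key` field, which is a simplification of real aggregation responses; `merge_buckets` is a hypothetical helper, not the endpoint's implementation.

```python
def merge_buckets(live_buckets, rollup_buckets):
    """Merge two bucket lists by key; live (non-rollup) buckets win on overlap."""
    merged = {bucket["key"]: bucket for bucket in rollup_buckets}
    # Updating with the live buckets overwrites any rollup bucket with the same key.
    merged.update({bucket["key"]: bucket for bucket in live_buckets})
    return [merged[key] for key in sorted(merged)]
```

For instance, if both lists contain a bucket with key `2`, the merged result keeps the bucket from the non-rollup list and drops the rollup one, while buckets that appear in only one list are carried over unchanged.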
  219. The response to the above query looks as expected, despite spanning rollup and
  220. non-rollup indices:
  221. [source,console-result]
  222. ----
  223. {
  224. "took" : 102,
  225. "timed_out" : false,
  226. "terminated_early" : false,
  227. "_shards" : ... ,
  228. "hits" : {
  229. "total" : {
  230. "value": 0,
  231. "relation": "eq"
  232. },
  233. "max_score" : 0.0,
  234. "hits" : [ ]
  235. },
  236. "aggregations" : {
  237. "max_temperature" : {
  238. "value" : 202.0
  239. }
  240. }
  241. }
  242. ----
  243. // TESTRESPONSE[s/"took" : 102/"took" : $body.$_path/]
  244. // TESTRESPONSE[s/"_shards" : \.\.\. /"_shards" : $body.$_path/]