[[search-aggregations]]
= Aggregations

[partintro]
--
An aggregation summarizes your data as metrics, statistics, or other analytics.
Aggregations help you answer questions like:

* What's the average load time for my website?
* Who are my most valuable customers based on transaction volume?
* What would be considered a large file on my network?
* How many products are in each product category?

{es} organizes aggregations into three categories:

* <<search-aggregations-metrics,Metric>> aggregations that calculate metrics,
such as a sum or average, from field values.

* <<search-aggregations-bucket,Bucket>> aggregations that
group documents into buckets, also called bins, based on field values, ranges,
or other criteria.

* <<search-aggregations-pipeline,Pipeline>> aggregations that take input from
other aggregations instead of documents or fields.

[discrete]
[[run-an-agg]]
=== Run an aggregation

You can run aggregations as part of a <<search-your-data,search>> by specifying
the <<search-search,search API>>'s `aggs` parameter. The following search runs a
<<search-aggregations-bucket-terms-aggregation,terms aggregation>> on
`my-field`:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

Aggregation results are in the response's `aggregations` object:

[source,console-result]
----
{
  "took": 78,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [...]
  },
  "aggregations": {
    "my-agg-name": {                           <1>
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}
----
// TESTRESPONSE[s/"took": 78/"took": "$body.took"/]
// TESTRESPONSE[s/\.\.\.$/"took": "$body.took", "timed_out": false, "_shards": "$body._shards", /]
// TESTRESPONSE[s/"hits": \[\.\.\.\]/"hits": "$body.hits.hits"/]
// TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":"get","doc_count":5\}\]/]

<1> Results for the `my-agg-name` aggregation.
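
As a rough sketch of how a client might read these results, the following
Python snippet collects the bucket keys and document counts of a terms
aggregation. The `response` dictionary is a hand-written stand-in that mirrors
the response shape above with one illustrative bucket filled in, and the
`bucket_counts` helper is a hypothetical name, not part of any {es} client.

[source,python]
----
def bucket_counts(response, agg_name):
    """Map each bucket key of a terms aggregation to its document count."""
    agg = response["aggregations"][agg_name]
    return {bucket["key"]: bucket["doc_count"] for bucket in agg["buckets"]}


# Stand-in for a parsed search response (assumed data, for illustration only).
response = {
    "aggregations": {
        "my-agg-name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [{"key": "get", "doc_count": 5}],
        }
    }
}

counts = bucket_counts(response, "my-agg-name")
print(counts)
----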

[discrete]
[[change-agg-scope]]
=== Change an aggregation's scope

Use the `query` parameter to limit the documents on which an aggregation runs:

[source,console]
----
GET /my-index-000001/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-1d/d",
        "lt": "now/d"
      }
    }
  },
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

[discrete]
[[return-only-agg-results]]
=== Return only aggregation results

By default, searches containing an aggregation return both search hits and
aggregation results. To return only aggregation results, set `size` to `0`:

[source,console]
----
GET /my-index-000001/_search
{
  "size": 0,
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

[discrete]
[[run-multiple-aggs]]
=== Run multiple aggregations

You can specify multiple aggregations in the same request:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-first-agg-name": {
      "terms": {
        "field": "my-field"
      }
    },
    "my-second-agg-name": {
      "avg": {
        "field": "my-other-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]
// TEST[s/my-other-field/http.response.bytes/]

[discrete]
[[run-sub-aggs]]
=== Run sub-aggregations

Bucket aggregations support bucket or metric sub-aggregations. For example, a
terms aggregation with an <<search-aggregations-metrics-avg-aggregation,avg>>
sub-aggregation calculates an average value for each bucket of documents. There
is no level or depth limit for nesting sub-aggregations.

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      },
      "aggs": {
        "my-sub-agg-name": {
          "avg": {
            "field": "my-other-field"
          }
        }
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/_search/_search?size=0/]
// TEST[s/my-field/http.request.method/]
// TEST[s/my-other-field/http.response.bytes/]

The response nests sub-aggregation results under their parent aggregation:

[source,console-result]
----
{
  ...
  "aggregations": {
    "my-agg-name": {                           <1>
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "foo",
          "doc_count": 5,
          "my-sub-agg-name": {                 <2>
            "value": 75.0
          }
        }
      ]
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
// TESTRESPONSE[s/"key": "foo"/"key": "get"/]
// TESTRESPONSE[s/"value": 75.0/"value": $body.aggregations.my-agg-name.buckets.0.my-sub-agg-name.value/]

<1> Results for the parent aggregation, `my-agg-name`.
<2> Results for `my-agg-name`'s sub-aggregation, `my-sub-agg-name`.
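
Because each sub-aggregation result sits inside its parent's bucket, a client
reads it by walking the parent's `buckets` array first. The following Python
snippet is a minimal sketch of that lookup. The `response` dictionary copies the
example values above, and `sub_agg_values` is a hypothetical helper name used
only for illustration.

[source,python]
----
def sub_agg_values(response, agg_name, sub_agg_name):
    """Map each parent bucket key to the value of a metric sub-aggregation."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket[sub_agg_name]["value"] for bucket in buckets}


# Stand-in for a parsed search response, copied from the example above.
response = {
    "aggregations": {
        "my-agg-name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "foo",
                    "doc_count": 5,
                    "my-sub-agg-name": {"value": 75.0},
                }
            ],
        }
    }
}

averages = sub_agg_values(response, "my-agg-name", "my-sub-agg-name")
print(averages)
----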

[discrete]
[[add-metadata-to-an-agg]]
=== Add custom metadata

Use the `meta` object to associate custom metadata with an aggregation:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      },
      "meta": {
        "my-metadata-field": "foo"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/_search/_search?size=0/]

The response returns the `meta` object in place:

[source,console-result]
----
{
  ...
  "aggregations": {
    "my-agg-name": {
      "meta": {
        "my-metadata-field": "foo"
      },
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]

[discrete]
[[return-agg-type]]
=== Return the aggregation type

By default, aggregation results include the aggregation's name but not its type.
To return the aggregation type, use the `typed_keys` query parameter.

[source,console]
----
GET /my-index-000001/_search?typed_keys
{
  "aggs": {
    "my-agg-name": {
      "histogram": {
        "field": "my-field",
        "interval": 1000
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/typed_keys/typed_keys&size=0/]
// TEST[s/my-field/http.response.bytes/]

The response returns the aggregation type as a prefix to the aggregation's name.

IMPORTANT: Some aggregations return a different aggregation type from the
type in the request. For example, the terms,
<<search-aggregations-bucket-significantterms-aggregation,significant terms>>,
and <<search-aggregations-metrics-percentile-aggregation,percentiles>>
aggregations return different aggregation types depending on the data type of
the aggregated field.

[source,console-result]
----
{
  ...
  "aggregations": {
    "histogram#my-agg-name": {                 <1>
      "buckets": []
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
// TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":1070000.0,"doc_count":5\}\]/]

<1> The aggregation type, `histogram`, followed by a `#` separator and the aggregation's name, `my-agg-name`.
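
A client that requests `typed_keys` has to split each key back into its type
and name. The following Python snippet is one possible sketch. It assumes the
type prefix never contains `#`, so it splits at the first separator, and
`split_typed_key` is a hypothetical helper name.

[source,python]
----
def split_typed_key(key):
    """Split a typed key such as 'histogram#my-agg-name' into (type, name).

    Returns (None, key) when the key has no '#' separator, which is the
    case for responses requested without `typed_keys`.
    """
    agg_type, separator, name = key.partition("#")
    if not separator:
        return None, key
    return agg_type, name


typed = split_typed_key("histogram#my-agg-name")
untyped = split_typed_key("my-agg-name")
print(typed, untyped)
----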

[discrete]
[[use-scripts-in-an-agg]]
=== Use scripts in an aggregation

When a field doesn't exactly match the aggregation you need, you
should aggregate on a <<runtime,runtime field>>:

[source,console]
----
GET /my-index-000001/_search?size=0
{
  "runtime_mappings": {
    "message.length": {
      "type": "long",
      "script": "emit(doc['message.keyword'].value.length())"
    }
  },
  "aggs": {
    "message_length": {
      "histogram": {
        "interval": 10,
        "field": "message.length"
      }
    }
  }
}
----
// TEST[setup:my_index]

////
[source,console-result]
----
{
  "timed_out": false,
  "took": "$body.took",
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0,
    "skipped": 0
  },
  "hits": "$body.hits",
  "aggregations": {
    "message_length": {
      "buckets": [
        {
          "key": 30.0,
          "doc_count": 5
        }
      ]
    }
  }
}
----
////

Scripts calculate field values dynamically, which adds a little
overhead to the aggregation. In addition to the time spent calculating,
some aggregations like <<search-aggregations-bucket-terms-aggregation,`terms`>>
and <<search-aggregations-bucket-filters-aggregation,`filters`>> can't use
some of their optimizations with runtime fields. In total, the performance cost
of using a runtime field varies from aggregation to aggregation.

// TODO when we have calculated fields we can link to them here.

[discrete]
[[agg-caches]]
=== Aggregation caches

For faster responses, {es} caches the results of frequently run aggregations in
the <<shard-request-cache,shard request cache>>. To get cached results, use the
same <<shard-and-node-preference,`preference` string>> for each search. If you
don't need search hits, <<return-only-agg-results,set `size` to `0`>> to avoid
filling the cache.

{es} routes searches with the same preference string to the same shards. If the
shards' data doesn't change between searches, the shards return cached
aggregation results.
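
To illustrate, the following Python snippet sketches how a client might build
a search path that reuses one preference string and sets `size` to `0`. The
`search_path` helper and the `my-session-id` value are hypothetical. Only the
`preference` and `size` query parameters come from the search API.

[source,python]
----
from urllib.parse import urlencode


def search_path(index, preference):
    """Build a search path with `size=0` and a fixed preference string."""
    query = urlencode({"size": 0, "preference": preference})
    return f"/{index}/_search?{query}"


# Reusing the same preference string yields the same path on every call.
first = search_path("my-index-000001", "my-session-id")
second = search_path("my-index-000001", "my-session-id")
print(first)
----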

[discrete]
[[limits-for-long-values]]
=== Limits for `long` values

When running aggregations, {es} uses <<number,`double`>> values to hold and
represent numeric data. As a result, aggregations on <<number,`long`>> numbers
greater than +2^53^+ are approximate.
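
The limit follows from the 53-bit significand of a double-precision value. As
a small illustration outside {es}, the following Python snippet checks whether
an integer survives conversion to a double. Python's `float` is an IEEE 754
double on mainstream platforms, and `survives_double` is a hypothetical helper
name.

[source,python]
----
LIMIT = 2**53


def survives_double(value):
    """Return True if an integer is unchanged by a round trip through a double."""
    return int(float(value)) == value


print(survives_double(LIMIT))      # 2^53 itself is exactly representable
print(survives_double(LIMIT + 1))  # 2^53 + 1 is rounded to a neighboring value
----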
--

include::aggregations/bucket.asciidoc[]

include::aggregations/metrics.asciidoc[]

include::aggregations/pipeline.asciidoc[]