[[search-aggregations-bucket-range-aggregation]]
=== Range aggregation
++++
<titleabbrev>Range</titleabbrev>
++++

A multi-bucket, value-source-based aggregation that lets you define a set of ranges, each representing a bucket. During aggregation, the values extracted from each document are checked against each range, and each matching document is placed in the corresponding bucket.
Note that this aggregation includes the `from` value and excludes the `to` value for each range.

Example:

[source,console,id=range-aggregation-example]
----
GET sales/_search
{
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 100.0 },
          { "from": 100.0, "to": 200.0 },
          { "from": 200.0 }
        ]
      }
    }
  }
}
----
// TEST[setup:sales]
// TEST[s/_search/_search\?filter_path=aggregations/]

Response:

[source,console-result]
----
{
  ...
  "aggregations": {
    "price_ranges": {
      "buckets": [
        {
          "key": "*-100.0",
          "to": 100.0,
          "doc_count": 2
        },
        {
          "key": "100.0-200.0",
          "from": 100.0,
          "to": 200.0,
          "doc_count": 2
        },
        {
          "key": "200.0-*",
          "from": 200.0,
          "doc_count": 3
        }
      ]
    }
  }
}
----
// TESTRESPONSE[s/\.\.\.//]
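
The half-open rule (`from` inclusive, `to` exclusive) can be illustrated outside Elasticsearch with a minimal Python sketch. It is not how Elasticsearch implements the aggregation. The seven prices and the `in_range` helper are assumptions for illustration, chosen to be consistent with the doc counts in the response above.

```python
import math

# Hypothetical prices, chosen to be consistent with the doc counts above.
prices = [10, 50, 150, 175, 200, 200, 200]

# Each range is (from, to); None marks an open end.
ranges = [(None, 100.0), (100.0, 200.0), (200.0, None)]


def in_range(value, lower, upper):
    """Return True when lower <= value < upper (from inclusive, to exclusive)."""
    lo = -math.inf if lower is None else lower
    hi = math.inf if upper is None else upper
    return lo <= value < hi


# Count the matching values for every configured range.
doc_counts = [sum(in_range(p, lo, hi) for p in prices) for lo, hi in ranges]
print(doc_counts)  # [2, 2, 3]
```

A price of exactly `200` is counted in the `200.0-*` bucket and not in `100.0-200.0`, because the `to` bound is exclusive.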

==== Keyed Response

Setting the `keyed` flag to `true` associates a unique string key with each bucket and returns the ranges as a hash rather than an array:

[source,console,id=range-aggregation-keyed-example]
----
GET sales/_search
{
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "keyed": true,
        "ranges": [
          { "to": 100 },
          { "from": 100, "to": 200 },
          { "from": 200 }
        ]
      }
    }
  }
}
----
// TEST[setup:sales]
// TEST[s/_search/_search\?filter_path=aggregations/]

Response:

[source,console-result]
----
{
  ...
  "aggregations": {
    "price_ranges": {
      "buckets": {
        "*-100.0": {
          "to": 100.0,
          "doc_count": 2
        },
        "100.0-200.0": {
          "from": 100.0,
          "to": 200.0,
          "doc_count": 2
        },
        "200.0-*": {
          "from": 200.0,
          "doc_count": 3
        }
      }
    }
  }
}
----
// TESTRESPONSE[s/\.\.\.//]

It is also possible to customize the key for each range:

[source,console,id=range-aggregation-custom-keys-example]
----
GET sales/_search
{
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "keyed": true,
        "ranges": [
          { "key": "cheap", "to": 100 },
          { "key": "average", "from": 100, "to": 200 },
          { "key": "expensive", "from": 200 }
        ]
      }
    }
  }
}
----
// TEST[setup:sales]
// TEST[s/_search/_search\?filter_path=aggregations/]

Response:

[source,console-result]
----
{
  ...
  "aggregations": {
    "price_ranges": {
      "buckets": {
        "cheap": {
          "to": 100.0,
          "doc_count": 2
        },
        "average": {
          "from": 100.0,
          "to": 200.0,
          "doc_count": 2
        },
        "expensive": {
          "from": 200.0,
          "doc_count": 3
        }
      }
    }
  }
}
----
// TESTRESPONSE[s/\.\.\.//]
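
The key naming visible in the responses above can be mimicked with a short Python sketch: an open end renders as `*`, bounds render as decimals, and a custom `key` takes precedence. The helpers `format_bound` and `bucket_key` are hypothetical names; this models only the keys shown on this page, not the Elasticsearch implementation.

```python
def format_bound(bound):
    """Render a bound as a decimal string, or `*` when the range is open-ended."""
    return "*" if bound is None else str(float(bound))


def bucket_key(spec):
    """Use the caller-supplied key if present, otherwise build `<from>-<to>`."""
    default = f"{format_bound(spec.get('from'))}-{format_bound(spec.get('to'))}"
    return spec.get("key") or default


default_specs = [{"to": 100}, {"from": 100, "to": 200}, {"from": 200}]
custom_specs = [
    {"key": "cheap", "to": 100},
    {"key": "average", "from": 100, "to": 200},
    {"key": "expensive", "from": 200},
]

default_keys = [bucket_key(s) for s in default_specs]
custom_keys = [bucket_key(s) for s in custom_specs]
print(default_keys)  # ['*-100.0', '100.0-200.0', '200.0-*']
print(custom_keys)   # ['cheap', 'average', 'expensive']
```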

==== Script

If the data in your documents doesn't exactly match what you'd like to aggregate,
use a <<runtime,runtime field>>. For example, if you need to
apply a particular currency conversion rate:

[source,console,id=range-aggregation-runtime-field-example]
----
GET sales/_search
{
  "runtime_mappings": {
    "price.euros": {
      "type": "double",
      "script": {
        "source": """
          emit(doc['price'].value * params.conversion_rate)
        """,
        "params": {
          "conversion_rate": 0.835526591
        }
      }
    }
  },
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price.euros",
        "ranges": [
          { "to": 100 },
          { "from": 100, "to": 200 },
          { "from": 200 }
        ]
      }
    }
  }
}
----
// TEST[setup:sales]
// TEST[s/_search/_search\?filter_path=aggregations/]

//////////////////////////

[source,console-result]
----
{
  "aggregations": {
    "price_ranges": {
      "buckets": [
        {
          "key": "*-100.0",
          "to": 100.0,
          "doc_count": 2
        },
        {
          "key": "100.0-200.0",
          "from": 100.0,
          "to": 200.0,
          "doc_count": 5
        },
        {
          "key": "200.0-*",
          "from": 200.0,
          "doc_count": 0
        }
      ]
    }
  }
}
----

//////////////////////////
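
To see how the conversion shifts documents between buckets, the same arithmetic can be sketched in plain Python. The price list is an assumption for illustration (seven sample prices, not read from any index), and `count_in` is a hypothetical helper; only the conversion rate comes from the request above.

```python
# Hypothetical sample prices; the conversion rate is the one used in the request.
prices = [10, 50, 150, 175, 200, 200, 200]
conversion_rate = 0.835526591

# Equivalent of the runtime field: one converted value per document.
euros = [p * conversion_rate for p in prices]


def count_in(values, lower, upper):
    """Count values with lower <= v < upper; None marks an open end."""
    return sum(
        (lower is None or v >= lower) and (upper is None or v < upper)
        for v in values
    )


counts = [
    count_in(euros, None, 100),
    count_in(euros, 100, 200),
    count_in(euros, 200, None),
]
print(counts)  # [2, 5, 0]
```

With these assumed prices, the three documents priced at `200` convert to roughly `167.1` and fall into the middle range, leaving the top range empty.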

==== Sub Aggregations

The following example not only buckets the documents into the different ranges but also computes statistics over the prices in each price range:

[source,console,id=range-aggregation-sub-aggregation-example]
----
GET sales/_search
{
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 100 },
          { "from": 100, "to": 200 },
          { "from": 200 }
        ]
      },
      "aggs": {
        "price_stats": {
          "stats": { "field": "price" }
        }
      }
    }
  }
}
----
// TEST[setup:sales]
// TEST[s/_search/_search\?filter_path=aggregations/]

Response:

[source,console-result]
----
{
  ...
  "aggregations": {
    "price_ranges": {
      "buckets": [
        {
          "key": "*-100.0",
          "to": 100.0,
          "doc_count": 2,
          "price_stats": {
            "count": 2,
            "min": 10.0,
            "max": 50.0,
            "avg": 30.0,
            "sum": 60.0
          }
        },
        {
          "key": "100.0-200.0",
          "from": 100.0,
          "to": 200.0,
          "doc_count": 2,
          "price_stats": {
            "count": 2,
            "min": 150.0,
            "max": 175.0,
            "avg": 162.5,
            "sum": 325.0
          }
        },
        {
          "key": "200.0-*",
          "from": 200.0,
          "doc_count": 3,
          "price_stats": {
            "count": 3,
            "min": 200.0,
            "max": 200.0,
            "avg": 200.0,
            "sum": 600.0
          }
        }
      ]
    }
  }
}
----
// TESTRESPONSE[s/\.\.\.//]
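
The relationship between the buckets and the nested `stats` values can be checked by hand. The following Python sketch assumes a price list consistent with the response above and uses a hypothetical `stats` helper; it illustrates the idea of running a metric per bucket and does not handle empty buckets.

```python
# Hypothetical prices, consistent with the statistics in the response above.
prices = [10.0, 50.0, 150.0, 175.0, 200.0, 200.0, 200.0]

# (from, to) pairs; None marks an open end.
ranges = [(None, 100.0), (100.0, 200.0), (200.0, None)]


def stats(values):
    """Compute the five stats metrics for a non-empty list of values."""
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


buckets = []
for lower, upper in ranges:
    # Step 1: bucket the documents (from inclusive, to exclusive).
    members = [
        p for p in prices
        if (lower is None or p >= lower) and (upper is None or p < upper)
    ]
    # Step 2: run the sub-aggregation over the documents in this bucket only.
    buckets.append({"doc_count": len(members), "price_stats": stats(members)})

for bucket in buckets:
    print(bucket)
```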

[[search-aggregations-bucket-range-aggregation-histogram-fields]]
==== Histogram fields

Running a range aggregation over histogram fields computes the total number of counts for each configured range.
This is done without interpolating between the histogram field values. Consequently, a range can fall
"in-between" two histogram values, in which case the resulting range bucket has a zero doc count.

The following example runs a range aggregation against an index that stores pre-aggregated histograms
of latency metrics (in milliseconds) for different networks:

[source,console]
----
PUT metrics_index
{
  "mappings": {
    "properties": {
      "network": {
        "properties": {
          "name": {
            "type": "keyword"
          }
        }
      },
      "latency_histo": {
        "type": "histogram"
      }
    }
  }
}

PUT metrics_index/_doc/1?refresh
{
  "network.name" : "net-1",
  "latency_histo" : {
    "values" : [1, 3, 8, 12, 15],
    "counts" : [3, 7, 23, 12, 6]
  }
}

PUT metrics_index/_doc/2?refresh
{
  "network.name" : "net-2",
  "latency_histo" : {
    "values" : [1, 6, 8, 12, 14],
    "counts" : [8, 17, 8, 7, 6]
  }
}

GET metrics_index/_search?size=0&filter_path=aggregations
{
  "aggs": {
    "latency_ranges": {
      "range": {
        "field": "latency_histo",
        "ranges": [
          {"to": 2},
          {"from": 2, "to": 3},
          {"from": 3, "to": 10},
          {"from": 10}
        ]
      }
    }
  }
}
----

The `range` aggregation sums the `counts` of the histogram `values` that fall into each range and
returns the following output:

[source,console-result]
----
{
  "aggregations": {
    "latency_ranges": {
      "buckets": [
        {
          "key": "*-2.0",
          "to": 2.0,
          "doc_count": 11
        },
        {
          "key": "2.0-3.0",
          "from": 2.0,
          "to": 3.0,
          "doc_count": 0
        },
        {
          "key": "3.0-10.0",
          "from": 3.0,
          "to": 10.0,
          "doc_count": 55
        },
        {
          "key": "10.0-*",
          "from": 10.0,
          "doc_count": 31
        }
      ]
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
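
The doc counts above can be reproduced by hand. The Python sketch below takes the two histograms from the indexed documents and adds up the `counts` whose `values` fall into each range, with no interpolation. The `range_total` helper is a hypothetical name, and the code models only the arithmetic, not Elasticsearch internals.

```python
# The two pre-aggregated histograms indexed in the example above.
histograms = [
    {"values": [1, 3, 8, 12, 15], "counts": [3, 7, 23, 12, 6]},
    {"values": [1, 6, 8, 12, 14], "counts": [8, 17, 8, 7, 6]},
]

# (from, to) pairs matching the request; None marks an open end.
ranges = [(None, 2), (2, 3), (3, 10), (10, None)]


def range_total(lower, upper):
    """Sum the counts of every histogram value with lower <= value < upper."""
    total = 0
    for histo in histograms:
        for value, count in zip(histo["values"], histo["counts"]):
            if (lower is None or value >= lower) and (upper is None or value < upper):
                total += count
    return total


totals = [range_total(lo, hi) for lo, hi in ranges]
print(totals)  # [11, 0, 55, 31]
```

The `2.0-3.0` range is empty because no histogram value lies in it: the value `3` is excluded by the exclusive `to` bound and is counted in `3.0-10.0` instead.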

[IMPORTANT]
========
Range aggregation is a bucket aggregation: it partitions documents into buckets rather than calculating metrics over fields as
metrics aggregations do. Each bucket represents a collection of documents on which sub-aggregations can run.
On the other hand, a histogram field is a pre-aggregated field representing multiple values inside a single field:
buckets of numerical data and a count of items/documents for each bucket. This mismatch between the range aggregation's expected input
(raw documents) and the histogram field (which provides summary information) limits the outcome of the aggregation
to only the doc counts for each bucket.

**Consequently, when executing a range aggregation over a histogram field, no sub-aggregations are allowed.**
========