  1. [role="xpack"]
  2. [testenv="basic"]
  3. [[search-aggregations-metrics-top-metrics]]
  4. === Top Metrics Aggregation
  5. experimental[We expect to change the response format of this aggregation as we add more features., https://github.com/elastic/elasticsearch/issues/51813]
  6. The `top_metrics` aggregation selects metrics from the document with the largest or smallest "sort"
  7. value. For example, this gets the value of the `v` field on the document with the largest value of `s`:
  8. [source,console,id=search-aggregations-metrics-top-metrics-simple]
  9. ----
  10. POST /test/_bulk?refresh
  11. {"index": {}}
  12. {"s": 1, "v": 3.1415}
  13. {"index": {}}
  14. {"s": 2, "v": 1.0}
  15. {"index": {}}
  16. {"s": 3, "v": 2.71828}
  17. POST /test/_search?filter_path=aggregations
  18. {
  19. "aggs": {
  20. "tm": {
  21. "top_metrics": {
  22. "metrics": {"field": "v"},
  23. "sort": {"s": "desc"}
  24. }
  25. }
  26. }
  27. }
  28. ----
  29. Which returns:
  30. [source,js]
  31. ----
  32. {
  33. "aggregations": {
  34. "tm": {
  35. "top": [ {"sort": [3], "metrics": {"v": 2.718280076980591 } } ]
  36. }
  37. }
  38. }
  39. ----
  40. // TESTRESPONSE
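The returned `v` is slightly different from the indexed `2.71828`. One plausible explanation, assuming `v` was dynamically mapped as a single-precision `float` (the example does not show the mapping), is that the value was rounded to 32 bits when indexed and widened back to a double in the response. The Python sketch below reproduces that round trip; `to_float32` is a hypothetical helper, not part of any Elasticsearch client.

```python
import struct


def to_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Compare the indexed value with what a single-precision round trip produces.
print(2.71828, "->", to_float32(2.71828))
```

Values that are exactly representable in 32 bits, such as `1.0`, come back unchanged.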
  41. `top_metrics` is fairly similar to <<search-aggregations-metrics-top-hits-aggregation, `top_hits`>>
  42. in spirit, but because it is more limited it can do its job using less memory and is often
  43. faster.
  44. ==== `sort`
  45. The `sort` field in the metric request functions exactly the same as the `sort` field in the
  46. <<request-body-search-sort, search>> request except:
  47. * It can't be used on <<binary,binary>>, <<flattened,flattened>>, <<ip,ip>>,
  48. <<keyword,keyword>>, or <<text,text>> fields.
  49. * It only supports a single sort value.
  50. The aggregation returns metrics from the first hit that the search request would
  51. return. So,
  52. `"sort": {"s": "desc"}`:: gets metrics from the document with the highest `s`
  53. `"sort": {"s": "asc"}`:: gets metrics from the document with the lowest `s`
  54. `"sort": {"_geo_distance": {"location": "35.7796, -78.6382"}}`::
  55. gets metrics from the document with `location` *closest* to `35.7796, -78.6382`
  56. `"sort": "_score"`:: gets metrics from the document with the highest score
  57. NOTE: This aggregation doesn't support any sort of "tie breaking". If two documents have
  58. the same sort values then this aggregation could return either document's fields.
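The selection rule for a plain field sort can be modeled in a few lines of Python. This is only a sketch of the semantics described above, not how Elasticsearch implements the aggregation; the function name `top_metrics` and its parameters are hypothetical, and the `_geo_distance` and `_score` sorts are not modeled.

```python
def top_metrics(docs: list, metric: str, sort_field: str, order: str) -> dict:
    """Return the metric from the document that a search sorted this way would list first."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    # "desc" means the largest sort value wins, "asc" the smallest.
    pick = max if order == "desc" else min
    top = pick(docs, key=lambda doc: doc[sort_field])
    return {"sort": [top[sort_field]], "metrics": {metric: top[metric]}}


docs = [{"s": 1, "v": 3.1415}, {"s": 2, "v": 1.0}, {"s": 3, "v": 2.71828}]
highest = top_metrics(docs, "v", "s", "desc")  # picks the document with s == 3
lowest = top_metrics(docs, "v", "s", "asc")    # picks the document with s == 1
```

When two documents tie on the sort value, this sketch happens to return whichever comes first in the list. The aggregation makes no such promise.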
  59. ==== `metrics`
  60. `metrics` selects the fields of the "top" document to return.
  61. You can return multiple metrics by providing a list:
  62. [source,console,id=search-aggregations-metrics-top-metrics-list-of-metrics]
  63. ----
  64. PUT /test
  65. {
  66. "mappings": {
  67. "properties": {
  68. "d": {"type": "date"}
  69. }
  70. }
  71. }
  72. POST /test/_bulk?refresh
  73. {"index": {}}
  74. {"s": 1, "v": 3.1415, "m": 1, "d": "2020-01-01T00:12:12Z"}
  75. {"index": {}}
  76. {"s": 2, "v": 1.0, "m": 6, "d": "2020-01-02T00:12:12Z"}
  77. {"index": {}}
  78. {"s": 3, "v": 2.71828, "m": -12, "d": "2019-12-31T00:12:12Z"}
  79. POST /test/_search?filter_path=aggregations
  80. {
  81. "aggs": {
  82. "tm": {
  83. "top_metrics": {
  84. "metrics": [
  85. {"field": "v"},
  86. {"field": "m"},
  87. {"field": "d"}
  88. ],
  89. "sort": {"s": "desc"}
  90. }
  91. }
  92. }
  93. }
  94. ----
  95. Which returns:
  96. [source,js]
  97. ----
  98. {
  99. "aggregations": {
  100. "tm": {
  101. "top": [ {
  102. "sort": [3],
  103. "metrics": {
  104. "v": 2.718280076980591,
  105. "m": -12,
  106. "d": "2019-12-31T00:12:12.000Z"
  107. }
  108. } ]
  109. }
  110. }
  111. }
  112. ----
  113. // TESTRESPONSE
  114. ==== `size`
  115. `top_metrics` can return the top few documents' worth of metrics using the `size` parameter:
  116. [source,console,id=search-aggregations-metrics-top-metrics-size]
  117. ----
  118. POST /test/_bulk?refresh
  119. {"index": {}}
  120. {"s": 1, "v": 3.1415}
  121. {"index": {}}
  122. {"s": 2, "v": 1.0}
  123. {"index": {}}
  124. {"s": 3, "v": 2.71828}
  125. POST /test/_search?filter_path=aggregations
  126. {
  127. "aggs": {
  128. "tm": {
  129. "top_metrics": {
  130. "metrics": {"field": "v"},
  131. "sort": {"s": "desc"},
  132. "size": 2
  133. }
  134. }
  135. }
  136. }
  137. ----
  138. Which returns:
  139. [source,js]
  140. ----
  141. {
  142. "aggregations": {
  143. "tm": {
  144. "top": [
  145. {"sort": [3], "metrics": {"v": 2.718280076980591 } },
  146. {"sort": [2], "metrics": {"v": 1.0 } }
  147. ]
  148. }
  149. }
  150. }
  151. ----
  152. // TESTRESPONSE
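As a rough model of what `size` does for a descending sort, the Python sketch below keeps the `size` documents with the largest sort values, best first. It illustrates the behavior only; the helper name `top_metrics` is hypothetical and the stored values are not rounded the way they are in the response above.

```python
import heapq


def top_metrics(docs: list, metric: str, sort_field: str, size: int = 1) -> list:
    """Model a descending sort: keep the `size` documents with the largest sort value."""
    # heapq.nlargest returns its results ordered from largest to smallest.
    top = heapq.nlargest(size, docs, key=lambda doc: doc[sort_field])
    return [{"sort": [d[sort_field]], "metrics": {metric: d[metric]}} for d in top]


docs = [{"s": 1, "v": 3.1415}, {"s": 2, "v": 1.0}, {"s": 3, "v": 2.71828}]
result = top_metrics(docs, "v", "s", size=2)
```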
  153. The default `size` is `1`. The default maximum `size` is `10` because the aggregation's
  154. working storage is "dense", meaning it allocates `size` slots for every bucket. `10`
  155. is a *very* conservative default maximum, and you can raise it if you need to by
  156. changing the `top_metrics_max_size` index setting. Be aware that large sizes can
  157. take a fair bit of memory, especially if they are inside an aggregation that
  158. makes many buckets, like a large
  159. <<search-aggregations-metrics-top-metrics-example-terms, terms aggregation>>.
  160. [source,console]
  161. ----
  162. PUT /test/_settings
  163. {
  164. "top_metrics_max_size": 100
  165. }
  166. ----
  167. // TEST[continued]
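To get a feel for how dense storage scales, the back-of-the-envelope Python sketch below multiplies buckets by `size`. The figures for values per slot and bytes per value are illustrative assumptions, not measurements of Elasticsearch's actual memory use, and both function names are hypothetical.

```python
def dense_slots(bucket_count: int, size: int) -> int:
    """Slots reserved when every bucket gets `size` slots, whether or not they are filled."""
    return bucket_count * size


def estimated_bytes(bucket_count: int, size: int,
                    values_per_slot: int = 2, bytes_per_value: int = 8) -> int:
    """Rough estimate, assuming one sort value plus one metric per slot at 8 bytes each."""
    return dense_slots(bucket_count, size) * values_per_slot * bytes_per_value


# 10,000 buckets at the default maximum size versus a raised maximum.
small = estimated_bytes(10_000, 10)   # 100,000 slots
large = estimated_bytes(10_000, 100)  # 1,000,000 slots
```

Under these assumptions, raising `size` from `10` to `100` multiplies the estimate by ten regardless of how many documents each bucket actually holds.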
  168. NOTE: If `size` is more than `1`, the `top_metrics` aggregation can't be the target of a sort.
  169. ==== Examples
  170. [[search-aggregations-metrics-top-metrics-example-terms]]
  171. ===== Use with terms
  172. This aggregation should be quite useful inside of a <<search-aggregations-bucket-terms-aggregation, `terms`>>
  173. aggregation to, say, find the last value reported by each server.
  174. [source,console,id=search-aggregations-metrics-top-metrics-terms]
  175. ----
  176. PUT /node
  177. {
  178. "mappings": {
  179. "properties": {
  180. "ip": {"type": "ip"},
  181. "date": {"type": "date"}
  182. }
  183. }
  184. }
  185. POST /node/_bulk?refresh
  186. {"index": {}}
  187. {"ip": "192.168.0.1", "date": "2020-01-01T01:01:01", "v": 1}
  188. {"index": {}}
  189. {"ip": "192.168.0.1", "date": "2020-01-01T02:01:01", "v": 2}
  190. {"index": {}}
  191. {"ip": "192.168.0.2", "date": "2020-01-01T02:01:01", "v": 3}
  192. POST /node/_search?filter_path=aggregations
  193. {
  194. "aggs": {
  195. "ip": {
  196. "terms": {
  197. "field": "ip"
  198. },
  199. "aggs": {
  200. "tm": {
  201. "top_metrics": {
  202. "metrics": {"field": "v"},
  203. "sort": {"date": "desc"}
  204. }
  205. }
  206. }
  207. }
  208. }
  209. }
  210. ----
  211. Which returns:
  212. [source,js]
  213. ----
  214. {
  215. "aggregations": {
  216. "ip": {
  217. "buckets": [
  218. {
  219. "key": "192.168.0.1",
  220. "doc_count": 2,
  221. "tm": {
  222. "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 2 } } ]
  223. }
  224. },
  225. {
  226. "key": "192.168.0.2",
  227. "doc_count": 1,
  228. "tm": {
  229. "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 3 } } ]
  230. }
  231. }
  232. ],
  233. "doc_count_error_upper_bound": 0,
  234. "sum_other_doc_count": 0
  235. }
  236. }
  237. }
  238. ----
  239. // TESTRESPONSE
  240. Unlike `top_hits`, you can sort buckets by the results of this metric:
  241. [source,console]
  242. ----
  243. POST /node/_search?filter_path=aggregations
  244. {
  245. "aggs": {
  246. "ip": {
  247. "terms": {
  248. "field": "ip",
  249. "order": {"tm.v": "desc"}
  250. },
  251. "aggs": {
  252. "tm": {
  253. "top_metrics": {
  254. "metrics": {"field": "v"},
  255. "sort": {"date": "desc"}
  256. }
  257. }
  258. }
  259. }
  260. }
  261. }
  262. ----
  263. // TEST[continued]
  264. Which returns:
  265. [source,js]
  266. ----
  267. {
  268. "aggregations": {
  269. "ip": {
  270. "buckets": [
  271. {
  272. "key": "192.168.0.2",
  273. "doc_count": 1,
  274. "tm": {
  275. "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 3 } } ]
  276. }
  277. },
  278. {
  279. "key": "192.168.0.1",
  280. "doc_count": 2,
  281. "tm": {
  282. "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 2 } } ]
  283. }
  284. }
  285. ],
  286. "doc_count_error_upper_bound": 0,
  287. "sum_other_doc_count": 0
  288. }
  289. }
  290. }
  291. ----
  292. // TESTRESPONSE
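The two requests above amount to "group by `ip`, keep `v` from the latest document in each group, then order the groups by that `v`". A minimal Python sketch of that logic follows; it models the result only, and the function name `latest_per_ip` is hypothetical.

```python
from collections import defaultdict


def latest_per_ip(docs: list) -> list:
    """Group by `ip`, keep `v` from the latest document per group, order groups by `v` descending."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["ip"]].append(doc)

    buckets = []
    for ip, members in groups.items():
        # ISO-8601 timestamps in one consistent format compare chronologically as strings.
        top = max(members, key=lambda doc: doc["date"])
        buckets.append({"key": ip, "doc_count": len(members), "v": top["v"]})

    # Counterpart of "order": {"tm.v": "desc"} on the terms aggregation.
    return sorted(buckets, key=lambda bucket: bucket["v"], reverse=True)


docs = [
    {"ip": "192.168.0.1", "date": "2020-01-01T01:01:01", "v": 1},
    {"ip": "192.168.0.1", "date": "2020-01-01T02:01:01", "v": 2},
    {"ip": "192.168.0.2", "date": "2020-01-01T02:01:01", "v": 3},
]
buckets = latest_per_ip(docs)
```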
  293. ===== Mixed sort types
  294. Sorting `top_metrics` by a field that has different types across different
  295. indices produces somewhat surprising results: floating point fields are
  296. always sorted independently of whole numbered fields.
  297. [source,console,id=search-aggregations-metrics-top-metrics-mixed-sort]
  298. ----
  299. POST /test/_bulk?refresh
  300. {"index": {"_index": "test1"}}
  301. {"s": 1, "v": 3.1415}
  302. {"index": {"_index": "test1"}}
  303. {"s": 2, "v": 1}
  304. {"index": {"_index": "test2"}}
  305. {"s": 3.1, "v": 2.71828}
  306. POST /test*/_search?filter_path=aggregations
  307. {
  308. "aggs": {
  309. "tm": {
  310. "top_metrics": {
  311. "metrics": {"field": "v"},
  312. "sort": {"s": "asc"}
  313. }
  314. }
  315. }
  316. }
  317. ----
  318. Which returns:
  319. [source,js]
  320. ----
  321. {
  322. "aggregations": {
  323. "tm": {
  324. "top": [ {"sort": [3.0999999046325684], "metrics": {"v": 2.718280076980591 } } ]
  325. }
  326. }
  327. }
  328. ----
  329. // TESTRESPONSE
  330. While this is better than an error, it *probably* isn't what you were going for.
  331. You can explicitly cast the whole number fields to floating points, at the cost
  332. of some precision, with something like:
  333. [source,console]
  334. ----
  335. POST /test*/_search?filter_path=aggregations
  336. {
  337. "aggs": {
  338. "tm": {
  339. "top_metrics": {
  340. "metrics": {"field": "v"},
  341. "sort": {"s": {"order": "asc", "numeric_type": "double"}}
  342. }
  343. }
  344. }
  345. }
  346. ----
  347. // TEST[continued]
  348. Which returns the much more expected:
  349. [source,js]
  350. ----
  351. {
  352. "aggregations": {
  353. "tm": {
  354. "top": [ {"sort": [1.0], "metrics": {"v": 3.1414999961853027 } } ]
  355. }
  356. }
  357. }
  358. ----
  359. // TESTRESPONSE