[role="xpack"]
[testenv="basic"]
[[search-aggregations-metrics-top-metrics]]
=== Top Metrics Aggregation

experimental[We expect to change the response format of this aggregation as we add more features., https://github.com/elastic/elasticsearch/issues/51813]

The `top_metrics` aggregation selects metrics from the document with the largest or smallest "sort"
value. For example, this gets the value of the `v` field on the document with the largest value of `s`:

[source,console,id=search-aggregations-metrics-top-metrics-simple]
----
POST /test/_bulk?refresh
{"index": {}}
{"s": 1, "v": 3.1415}
{"index": {}}
{"s": 2, "v": 1.0}
{"index": {}}
{"s": 3, "v": 2.71828}
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "v"},
        "sort": {"s": "desc"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {"sort": [3], "metrics": {"v": 2.718280076980591 } } ]
    }
  }
}
----
// TESTRESPONSE

`top_metrics` is similar in spirit to <<search-aggregations-metrics-top-hits-aggregation, `top_hits`>>,
but because it is more limited it can do its job using less memory and is often
faster.

==== `sort`

The `sort` field in the metric request functions exactly the same as the `sort` field in the
<<request-body-search-sort, search>> request except:

* It can't be used on <<binary,binary>>, <<flattened,flattened>>, <<ip,ip>>,
<<keyword,keyword>>, or <<text,text>> fields.
* It only supports a single sort value.

The metrics that the aggregation returns come from the first hit that the search
request would return. So,

`"sort": {"s": "desc"}`:: gets metrics from the document with the highest `s`
`"sort": {"s": "asc"}`:: gets metrics from the document with the lowest `s`
`"sort": {"_geo_distance": {"location": "35.7796, -78.6382"}}`::
  gets metrics from the document with `location` *closest* to `35.7796, -78.6382`
`"sort": "_score"`:: gets metrics from the document with the highest score

NOTE: This aggregation doesn't support any sort of "tie breaking". If two documents have
the same sort values then this aggregation could return either document's fields.
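
As a rough sketch of the `_geo_distance` form listed above, the request below sorts by
distance from a point and returns `v` from the closest document. The `places` index, its
mapping, and the sample documents are hypothetical and exist only for illustration; the
`sort` clause itself is the one shown in the list above.

[source,console]
----
# Hypothetical index and documents, for illustration only
PUT /places
{
  "mappings": {
    "properties": {
      "location": {"type": "geo_point"}
    }
  }
}
POST /places/_bulk?refresh
{"index": {}}
{"location": "35.7796, -78.6382", "v": 1}
{"index": {}}
{"location": "40.7128, -74.0060", "v": 2}
POST /places/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "v"},
        "sort": {"_geo_distance": {"location": "35.7796, -78.6382"}}
      }
    }
  }
}
----

Because `_geo_distance` sorts ascending by default, the "top" document here should be the
one nearest to the given point.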

==== `metrics`

`metrics` selects the fields of the "top" document to return. Like most other
aggregations, `top_metrics` casts these values to `double` precision
floating point numbers, so they have to be numeric. Dates *work*, but they
come back as a `double` precision floating point number containing milliseconds since
epoch. `keyword` fields aren't allowed.
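
To illustrate the date behavior described above, here is a minimal sketch that requests a
date field as the metric. The `events` index and its `ts` and `v` fields are made up for
this example. Going by the description above, the `ts` metric should come back as a
`double` holding milliseconds since epoch rather than as a formatted date string.

[source,console]
----
# Hypothetical index and fields, for illustration only
PUT /events
{
  "mappings": {
    "properties": {
      "ts": {"type": "date"}
    }
  }
}
POST /events/_bulk?refresh
{"index": {}}
{"ts": "2020-01-01T01:01:01", "v": 1}
{"index": {}}
{"ts": "2020-01-01T02:01:01", "v": 2}
POST /events/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "ts"},
        "sort": {"v": "desc"}
      }
    }
  }
}
----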

You can return multiple metrics by providing a list:

[source,console,id=search-aggregations-metrics-top-metrics-list-of-metrics]
----
POST /test/_bulk?refresh
{"index": {}}
{"s": 1, "v": 3.1415, "m": 1.9}
{"index": {}}
{"s": 2, "v": 1.0, "m": 6.7}
{"index": {}}
{"s": 3, "v": 2.71828, "m": -12.2}
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": [
          {"field": "v"},
          {"field": "m"}
        ],
        "sort": {"s": "desc"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {
        "sort": [3],
        "metrics": {
          "v": 2.718280076980591,
          "m": -12.199999809265137
        }
      } ]
    }
  }
}
----
// TESTRESPONSE

==== `size`

`top_metrics` can return the top few documents' worth of metrics using the `size` parameter:

[source,console,id=search-aggregations-metrics-top-metrics-size]
----
POST /test/_bulk?refresh
{"index": {}}
{"s": 1, "v": 3.1415}
{"index": {}}
{"s": 2, "v": 1.0}
{"index": {}}
{"s": 3, "v": 2.71828}
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "v"},
        "sort": {"s": "desc"},
        "size": 2
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [
        {"sort": [3], "metrics": {"v": 2.718280076980591 } },
        {"sort": [2], "metrics": {"v": 1.0 } }
      ]
    }
  }
}
----
// TESTRESPONSE

The default `size` is `1`. The default maximum `size` is `10` because the aggregation's
working storage is "dense", meaning we allocate `size` slots for every bucket. `10`
is a *very* conservative default maximum and you can raise it if you need to by
changing the `top_metrics_max_size` index setting. But know that large sizes can
take a fair bit of memory, especially if they are inside of an aggregation which
makes many buckets, like a large
<<search-aggregations-metrics-top-metrics-example-terms, terms aggregation>>.

[source,console]
----
PUT /test/_settings
{
  "top_metrics_max_size": 100
}
----
// TEST[continued]

NOTE: If `size` is more than `1` the `top_metrics` aggregation can't be the target of a sort.

==== Examples

[[search-aggregations-metrics-top-metrics-example-terms]]
===== Use with terms

This aggregation is useful inside a <<search-aggregations-bucket-terms-aggregation, `terms`>>
aggregation to, say, find the last value reported by each server.

[source,console,id=search-aggregations-metrics-top-metrics-terms]
----
PUT /node
{
  "mappings": {
    "properties": {
      "ip": {"type": "ip"},
      "date": {"type": "date"}
    }
  }
}
POST /node/_bulk?refresh
{"index": {}}
{"ip": "192.168.0.1", "date": "2020-01-01T01:01:01", "v": 1}
{"index": {}}
{"ip": "192.168.0.1", "date": "2020-01-01T02:01:01", "v": 2}
{"index": {}}
{"ip": "192.168.0.2", "date": "2020-01-01T02:01:01", "v": 3}
POST /node/_search?filter_path=aggregations
{
  "aggs": {
    "ip": {
      "terms": {
        "field": "ip"
      },
      "aggs": {
        "tm": {
          "top_metrics": {
            "metrics": {"field": "v"},
            "sort": {"date": "desc"}
          }
        }
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "ip": {
      "buckets": [
        {
          "key": "192.168.0.1",
          "doc_count": 2,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 2.0 } } ]
          }
        },
        {
          "key": "192.168.0.2",
          "doc_count": 1,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 3.0 } } ]
          }
        }
      ],
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0
    }
  }
}
----
// TESTRESPONSE

Unlike `top_hits`, you can sort buckets by the results of this metric:

[source,console]
----
POST /node/_search?filter_path=aggregations
{
  "aggs": {
    "ip": {
      "terms": {
        "field": "ip",
        "order": {"tm.v": "desc"}
      },
      "aggs": {
        "tm": {
          "top_metrics": {
            "metrics": {"field": "v"},
            "sort": {"date": "desc"}
          }
        }
      }
    }
  }
}
----
// TEST[continued]

Which returns:

[source,js]
----
{
  "aggregations": {
    "ip": {
      "buckets": [
        {
          "key": "192.168.0.2",
          "doc_count": 1,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 3.0 } } ]
          }
        },
        {
          "key": "192.168.0.1",
          "doc_count": 2,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 2.0 } } ]
          }
        }
      ],
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0
    }
  }
}
----
// TESTRESPONSE

===== Mixed sort types

Sorting `top_metrics` by a field that has different types across different
indices produces somewhat surprising results: floating point fields are
always sorted independently of whole numbered fields.

[source,console,id=search-aggregations-metrics-top-metrics-mixed-sort]
----
POST /test/_bulk?refresh
{"index": {"_index": "test1"}}
{"s": 1, "v": 3.1415}
{"index": {"_index": "test1"}}
{"s": 2, "v": 1}
{"index": {"_index": "test2"}}
{"s": 3.1, "v": 2.71828}
POST /test*/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "v"},
        "sort": {"s": "asc"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {"sort": [3.0999999046325684], "metrics": {"v": 2.718280076980591 } } ]
    }
  }
}
----
// TESTRESPONSE

While this is better than an error it *probably* isn't what you were going for.
Although it does lose some precision, you can explicitly cast the whole number
fields to floating point with something like:

[source,console]
----
POST /test*/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "v"},
        "sort": {"s": {"order": "asc", "numeric_type": "double"}}
      }
    }
  }
}
----
// TEST[continued]

Which returns the much more expected:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {"sort": [1.0], "metrics": {"v": 3.1414999961853027 } } ]
    }
  }
}
----
// TESTRESPONSE