[role="xpack"]
[[search-aggregations-metrics-top-metrics]]
=== Top metrics aggregation
++++
<titleabbrev>Top metrics</titleabbrev>
++++

The `top_metrics` aggregation selects metrics from the document with the largest or smallest "sort"
value. For example, this gets the value of the `m` field on the document with the largest value of `s`:

[source,console,id=search-aggregations-metrics-top-metrics-simple]
----
POST /test/_bulk?refresh
{"index": {}}
{"s": 1, "m": 3.1415}
{"index": {}}
{"s": 2, "m": 1.0}
{"index": {}}
{"s": 3, "m": 2.71828}
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "m"},
        "sort": {"s": "desc"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {"sort": [3], "metrics": {"m": 2.718280076980591 } } ]
    }
  }
}
----
// TESTRESPONSE

`top_metrics` is similar in spirit to <<search-aggregations-metrics-top-hits-aggregation, `top_hits`>>,
but because it is more limited it can do its job using less memory and is often
faster.

==== `sort`

The `sort` field in the metric request functions exactly the same as the `sort` field in the
<<sort-search-results, search>> request except:

* It can't be used on <<binary,binary>>, <<flattened,flattened>>, <<ip,ip>>,
<<keyword,keyword>>, or <<text,text>> fields.
* It supports only a single sort value, so which document wins a tie is not specified.

The metrics that the aggregation returns come from the first hit that the search
request would return. So,

`"sort": {"s": "desc"}`:: gets metrics from the document with the highest `s`
`"sort": {"s": "asc"}`:: gets metrics from the document with the lowest `s`
`"sort": {"_geo_distance": {"location": "POINT (-78.6382 35.7796)"}}`::
  gets metrics from the document with `location` *closest* to `35.7796, -78.6382`
`"sort": "_score"`:: gets metrics from the document with the highest score
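
The selection rule above can be sketched outside Elasticsearch as a plain in-memory model. This is only an illustration of the "first hit of a sorted search" idea: `top_document` is a hypothetical helper, and skipping documents that lack the sort field is an assumption of the sketch rather than documented behavior.

[source,python]
----
# Simplified, in-memory model of how `sort` picks the "top" document.
# `top_document` is a hypothetical helper, not an Elasticsearch API.

def top_document(docs, sort_field, order="desc"):
    """Return the document a search sorted on `sort_field` would list first."""
    # Assumption of this sketch: documents without the sort field are skipped.
    candidates = [doc for doc in docs if sort_field in doc]
    if not candidates:
        return None
    pick = max if order == "desc" else min
    # On ties Python keeps the first candidate; the aggregation makes no such promise.
    return pick(candidates, key=lambda doc: doc[sort_field])


docs = [
    {"s": 1, "m": 3.1415},
    {"s": 2, "m": 1.0},
    {"s": 3, "m": 2.71828},
]

highest = top_document(docs, "s", "desc")  # the document with s == 3
lowest = top_document(docs, "s", "asc")    # the document with s == 1
print(highest["m"], lowest["m"])
----

With `"desc"` the sketch picks the document whose `s` is 3, and with `"asc"` the one whose `s` is 1, matching the first two entries in the list above.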

==== `metrics`

`metrics` selects the fields of the "top" document to return. You can request
a single metric with something like `"metrics": {"field": "m"}` or multiple
metrics by requesting a list of metrics like `"metrics": [{"field": "m"}, {"field": "i"}]`.

`metrics.field` supports the following field types:

* <<boolean,`boolean`>>
* <<ip,`ip`>>
* <<keyword,keywords>>
* <<number,numbers>>

Except for keywords, <<runtime,runtime fields>> for corresponding types are also
supported. `metrics.field` doesn't support fields with <<array,array values>>. A
`top_metrics` aggregation on array values may return inconsistent results.

The following example runs a `top_metrics` aggregation on several field types.

[source,console,id=search-aggregations-metrics-top-metrics-list-of-metrics]
----
PUT /test
{
  "mappings": {
    "properties": {
      "d": {"type": "date"}
    }
  }
}
POST /test/_bulk?refresh
{"index": {}}
{"s": 1, "m": 3.1415, "i": 1, "d": "2020-01-01T00:12:12Z", "t": "cat"}
{"index": {}}
{"s": 2, "m": 1.0, "i": 6, "d": "2020-01-02T00:12:12Z", "t": "dog"}
{"index": {}}
{"s": 3, "m": 2.71828, "i": -12, "d": "2019-12-31T00:12:12Z", "t": "chicken"}
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": [
          {"field": "m"},
          {"field": "i"},
          {"field": "d"},
          {"field": "t.keyword"}
        ],
        "sort": {"s": "desc"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {
        "sort": [3],
        "metrics": {
          "m": 2.718280076980591,
          "i": -12,
          "d": "2019-12-31T00:12:12.000Z",
          "t.keyword": "chicken"
        }
      } ]
    }
  }
}
----
// TESTRESPONSE
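
As a rough mental model, the two accepted shapes of `metrics` (a single object or a list) can be normalized and applied to the "top" document as in the sketch below. `select_metrics` is a hypothetical helper, documents are modeled as flat dictionaries, and rejecting array values outright is a choice made for this sketch; the aggregation itself only warns that results may be inconsistent.

[source,python]
----
# In-memory model of `metrics`; `select_metrics` is hypothetical, not an Elasticsearch API.

def select_metrics(doc, metrics):
    """Return {field: value} for each requested metric of the top document."""
    # Accept a single {"field": ...} object or a list of them.
    specs = [metrics] if isinstance(metrics, dict) else list(metrics)
    selected = {}
    for spec in specs:
        value = doc.get(spec["field"])
        # This sketch rejects array values; the aggregation does not promise an error.
        if isinstance(value, (list, tuple)):
            raise ValueError(f"metric field {spec['field']!r} holds an array")
        selected[spec["field"]] = value
    return selected


# The document with the highest `s` from the example above.
top_doc = {"s": 3, "m": 2.71828, "i": -12, "d": "2019-12-31T00:12:12Z"}

single = select_metrics(top_doc, {"field": "m"})
several = select_metrics(top_doc, [{"field": "m"}, {"field": "i"}, {"field": "d"}])
print(single, several)
----

Both calls read from the same top document; only the number of fields copied into the result differs.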

==== `size`

`top_metrics` can return the top few documents' worth of metrics using the `size` parameter:

[source,console,id=search-aggregations-metrics-top-metrics-size]
----
POST /test/_bulk?refresh
{"index": {}}
{"s": 1, "m": 3.1415}
{"index": {}}
{"s": 2, "m": 1.0}
{"index": {}}
{"s": 3, "m": 2.71828}
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "m"},
        "sort": {"s": "desc"},
        "size": 3
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [
        {"sort": [3], "metrics": {"m": 2.718280076980591 } },
        {"sort": [2], "metrics": {"m": 1.0 } },
        {"sort": [1], "metrics": {"m": 3.1414999961853027 } }
      ]
    }
  }
}
----
// TESTRESPONSE

The default `size` is 1. The default maximum size is `10` because the aggregation's
working storage is "dense", meaning it allocates `size` slots for every bucket. `10`
is a *very* conservative default maximum, and you can raise it if you need to by
changing the `top_metrics_max_size` index setting. Be aware that large sizes can
take a fair bit of memory, especially if they are inside an aggregation that
makes many buckets, like a large
<<search-aggregations-metrics-top-metrics-example-terms, terms aggregation>>. If
you still want to raise it, use something like:

[source,console]
----
PUT /test/_settings
{
  "top_metrics_max_size": 100
}
----
// TEST[continued]

NOTE: If `size` is more than `1` the `top_metrics` aggregation can't be the *target* of a sort.
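
To make the memory argument concrete, here is a small sketch of `size` and of the "dense" slot count. `top_n` and `dense_slots` are hypothetical helpers; the sketch counts slots only, because the number of bytes per slot is not stated here.

[source,python]
----
# In-memory model of `size`; both helpers are hypothetical, not Elasticsearch APIs.

def top_n(docs, metric_field, sort_field, order="desc", size=1, max_size=10):
    """Return up to `size` entries shaped like the `top` array in the response."""
    if size > max_size:
        # Stands in for the limit configured by `top_metrics_max_size`.
        raise ValueError(f"size [{size}] exceeds the maximum of [{max_size}]")
    ranked = sorted(docs, key=lambda doc: doc[sort_field], reverse=(order == "desc"))
    return [
        {"sort": [doc[sort_field]], "metrics": {metric_field: doc[metric_field]}}
        for doc in ranked[:size]
    ]


def dense_slots(bucket_count, size):
    """Slots reserved when every bucket gets `size` of them, filled or not."""
    return bucket_count * size


docs = [{"s": 1, "m": 3.1415}, {"s": 2, "m": 1.0}, {"s": 3, "m": 2.71828}]
top = top_n(docs, "m", "s", size=3)
print([entry["sort"][0] for entry in top])

# 10,000 buckets at size 100 reserve a million slots, versus 10,000 at size 1.
print(dense_slots(10_000, 100), dense_slots(10_000, 1))
----

The slot count grows with the product of bucket count and `size`, which is why a large `size` inside a high-cardinality bucketing aggregation is the case to watch.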

==== Examples

[[search-aggregations-metrics-top-metrics-example-terms]]
===== Use with terms

This aggregation is useful inside a <<search-aggregations-bucket-terms-aggregation, `terms`>>
aggregation, for example to find the last value reported by each server.

[source,console,id=search-aggregations-metrics-top-metrics-terms]
----
PUT /node
{
  "mappings": {
    "properties": {
      "ip": {"type": "ip"},
      "date": {"type": "date"}
    }
  }
}
POST /node/_bulk?refresh
{"index": {}}
{"ip": "192.168.0.1", "date": "2020-01-01T01:01:01", "m": 1}
{"index": {}}
{"ip": "192.168.0.1", "date": "2020-01-01T02:01:01", "m": 2}
{"index": {}}
{"ip": "192.168.0.2", "date": "2020-01-01T02:01:01", "m": 3}
POST /node/_search?filter_path=aggregations
{
  "aggs": {
    "ip": {
      "terms": {
        "field": "ip"
      },
      "aggs": {
        "tm": {
          "top_metrics": {
            "metrics": {"field": "m"},
            "sort": {"date": "desc"}
          }
        }
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "ip": {
      "buckets": [
        {
          "key": "192.168.0.1",
          "doc_count": 2,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"m": 2 } } ]
          }
        },
        {
          "key": "192.168.0.2",
          "doc_count": 1,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"m": 3 } } ]
          }
        }
      ],
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0
    }
  }
}
----
// TESTRESPONSE

Unlike `top_hits`, you can sort buckets by the results of this metric:

[source,console]
----
POST /node/_search?filter_path=aggregations
{
  "aggs": {
    "ip": {
      "terms": {
        "field": "ip",
        "order": {"tm.m": "desc"}
      },
      "aggs": {
        "tm": {
          "top_metrics": {
            "metrics": {"field": "m"},
            "sort": {"date": "desc"}
          }
        }
      }
    }
  }
}
----
// TEST[continued]

Which returns:

[source,js]
----
{
  "aggregations": {
    "ip": {
      "buckets": [
        {
          "key": "192.168.0.2",
          "doc_count": 1,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"m": 3 } } ]
          }
        },
        {
          "key": "192.168.0.1",
          "doc_count": 2,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"m": 2 } } ]
          }
        }
      ],
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0
    }
  }
}
----
// TESTRESPONSE
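
The two requests in this section amount to "group by `ip`, keep the metric from the latest document in each group, then optionally order the groups by that metric". The sketch below models that logic in memory using the three documents from the bulk request. `last_metric_per_bucket` and `order_buckets` are hypothetical helpers, and the flat bucket shape is a simplification of the real response.

[source,python]
----
# In-memory model of `top_metrics` inside `terms`; helper names are hypothetical.
from collections import defaultdict


def last_metric_per_bucket(docs, key_field, metric_field, sort_field):
    """Group docs by `key_field` and keep the metric of the doc with the highest sort value."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[key_field]].append(doc)
    buckets = []
    for key, members in groups.items():
        # Timestamps in one fixed ISO 8601 format compare correctly as strings.
        latest = max(members, key=lambda doc: doc[sort_field])
        buckets.append(
            {"key": key, "doc_count": len(members), metric_field: latest[metric_field]}
        )
    return buckets


def order_buckets(buckets, metric_field, order="desc"):
    """Order buckets by the selected metric, like `"order": {"tm.m": "desc"}`."""
    return sorted(buckets, key=lambda b: b[metric_field], reverse=(order == "desc"))


docs = [
    {"ip": "192.168.0.1", "date": "2020-01-01T01:01:01", "m": 1},
    {"ip": "192.168.0.1", "date": "2020-01-01T02:01:01", "m": 2},
    {"ip": "192.168.0.2", "date": "2020-01-01T02:01:01", "m": 3},
]

buckets = last_metric_per_bucket(docs, "ip", "m", "date")
ordered = order_buckets(buckets, "m")
print([(b["key"], b["m"]) for b in ordered])
----

Ordering by the metric only makes sense when each bucket contributes a single value, which is consistent with the restriction that a `top_metrics` aggregation with `size` greater than `1` can't be the target of a sort.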

===== Mixed sort types

Sorting `top_metrics` by a field that has different types across different
indices produces somewhat surprising results: floating point fields are
always sorted independently of whole numbered fields.

[source,console,id=search-aggregations-metrics-top-metrics-mixed-sort]
----
POST /test/_bulk?refresh
{"index": {"_index": "test1"}}
{"s": 1, "m": 3.1415}
{"index": {"_index": "test1"}}
{"s": 2, "m": 1}
{"index": {"_index": "test2"}}
{"s": 3.1, "m": 2.71828}
POST /test*/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "m"},
        "sort": {"s": "asc"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {"sort": [3.0999999046325684], "metrics": {"m": 2.718280076980591 } } ]
    }
  }
}
----
// TESTRESPONSE

Although this is better than an error, it *probably* isn't what you wanted.
At the cost of some precision, you can explicitly cast the whole number
fields to floating points with something like:

[source,console]
----
POST /test*/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "m"},
        "sort": {"s": {"order": "asc", "numeric_type": "double"}}
      }
    }
  }
}
----
// TEST[continued]

Which returns the more expected result:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {"sort": [1.0], "metrics": {"m": 3.1414999961853027 } } ]
    }
  }
}
----
// TESTRESPONSE
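
The precision caveat mentioned above can be illustrated outside Elasticsearch. Python floats are 64-bit IEEE 754 doubles, so the sketch below shows what casting whole numbers to a double does: small values compare as expected, but distinct integers beyond 2^53 can collapse to the same double. `cast_to_double` is a hypothetical stand-in for the cast, not the actual implementation behind `numeric_type`.

[source,python]
----
# Illustration of the cast only; `cast_to_double` is hypothetical, not an Elasticsearch API.

def cast_to_double(value):
    """Convert a whole or floating point number to a 64-bit double."""
    return float(value)


# The sort values from the example, compared on one scale after the cast.
sort_values = [1, 2, 3.1]
smallest = min(cast_to_double(v) for v in sort_values)

# Doubles carry 53 bits of significand, so these two distinct integers
# become the same double and would tie in a sort.
exact = 2**53
neighbor = 2**53 + 1
collide = cast_to_double(exact) == cast_to_double(neighbor)

print(smallest, collide)
----

For sort values of ordinary magnitude the cast is harmless. It only matters for very large whole numbers, where two different values can become a tie.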