[role="xpack"]
[[search-aggregations-pipeline-normalize-aggregation]]
=== Normalize aggregation
++++
<titleabbrev>Normalize</titleabbrev>
++++

A parent pipeline aggregation that calculates the normalized (rescaled) value of each bucket value.
Values that cannot be normalized are skipped using the <<gap-policy, skip gap policy>>.

==== Syntax

A `normalize` aggregation looks like this in isolation:

[source,js]
--------------------------------------------------
{
  "normalize": {
    "buckets_path": "normalized",
    "method": "percent_of_sum"
  }
}
--------------------------------------------------
// NOTCONSOLE

[[normalize_pipeline-params]]
.`normalize_pipeline` Parameters
[options="header"]
|===
|Parameter Name |Description |Required |Default Value
|`buckets_path` |The path to the buckets we wish to normalize (see <<buckets-path-syntax, `buckets_path` syntax>> for more details) |Required |
|`method` | The specific <<normalize_pipeline-method, method>> to apply | Required |
|`format` |{javadoc}/java.base/java/text/DecimalFormat.html[DecimalFormat pattern] for the
output value. If specified, the formatted value is returned in the aggregation's
`value_as_string` property |Optional |`null`
|===

==== Methods
[[normalize_pipeline-method]]

The Normalize Aggregation supports multiple methods to transform the bucket values. Each method definition below uses
the following original set of bucket values as an example: `[5, 5, 10, 50, 10, 20]`.

_rescale_0_1_::
                This method rescales the data such that the minimum number is zero, and the maximum number is 1, with the rest normalized
                linearly in-between.

                x' = (x - min_x) / (max_x - min_x)

                [0, 0, .1111, 1, .1111, .3333]

_rescale_0_100_::
                This method rescales the data such that the minimum number is zero, and the maximum number is 100, with the rest normalized
                linearly in-between.

                x' = 100 * (x - min_x) / (max_x - min_x)

                [0, 0, 11.11, 100, 11.11, 33.33]

_percent_of_sum_::
                This method normalizes each value so that it represents a percentage of the total sum it contributes to.

                x' = x / sum_x

                [5%, 5%, 10%, 50%, 10%, 20%]

_mean_::
                This method normalizes each value by how much it differs from the average, relative to the range of the values.

                x' = (x - mean_x) / (max_x - min_x)

                [-0.26, -0.26, -0.15, 0.74, -0.15, 0.07]

_z-score_::
                This method normalizes such that each value represents how far it is from the mean relative to the standard deviation.

                x' = (x - mean_x) / stdev_x

                [-0.68, -0.68, -0.39, 1.94, -0.39, 0.19]

_softmax_::
                This method normalizes such that each value is exponentiated and divided by the sum of the exponentials of the original values.

                x' = e^x / sum_e_x

                [2.862E-20, 2.862E-20, 4.248E-18, 0.999, 4.248E-18, 9.357E-14]
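
The formulas above can be checked against the example values with a short script. The following is a minimal Python sketch, not Elasticsearch's implementation: the function names are hypothetical, and the use of the sample standard deviation (`statistics.stdev`, which divides by `n - 1`) is an assumption chosen because it reproduces the z-score values listed above. The sketch does not handle the degenerate case where all bucket values are equal (division by zero).

```python
import math
import statistics

# Example bucket values used throughout the method definitions.
values = [5, 5, 10, 50, 10, 20]


def rescale_0_1(xs):
    # x' = (x - min_x) / (max_x - min_x)
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def rescale_0_100(xs):
    # x' = 100 * (x - min_x) / (max_x - min_x)
    return [100 * v for v in rescale_0_1(xs)]


def percent_of_sum(xs):
    # x' = x / sum_x (a fraction; 0.05 corresponds to 5%)
    total = sum(xs)
    return [x / total for x in xs]


def mean_normalize(xs):
    # x' = (x - mean_x) / (max_x - min_x)
    lo, hi = min(xs), max(xs)
    mu = statistics.fmean(xs)
    return [(x - mu) / (hi - lo) for x in xs]


def z_score(xs):
    # x' = (x - mean_x) / stdev_x
    # Assumption: sample standard deviation (n - 1 in the denominator).
    mu = statistics.fmean(xs)
    sd = statistics.stdev(xs)
    return [(x - mu) / sd for x in xs]


def softmax(xs):
    # x' = e^x / sum_e_x
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    for fn in (rescale_0_1, rescale_0_100, percent_of_sum,
               mean_normalize, z_score, softmax):
        print(fn.__name__, [f"{v:.4g}" for v in fn(values)])
```

Because every method depends on statistics of the whole series (minimum, maximum, sum, mean, or standard deviation), each function first computes those statistics over all values and only then transforms the individual values.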

==== Example

The following snippet calculates the percent of total sales for each month:

[source,console]
--------------------------------------------------
POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        },
        "percent_of_total_sales": {
          "normalize": {
            "buckets_path": "sales",          <1>
            "method": "percent_of_sum",       <2>
            "format": "00.00%"                <3>
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]

<1> `buckets_path` instructs this normalize aggregation to use the output of the `sales` aggregation for rescaling
<2> `method` sets which rescaling to apply. In this case, `percent_of_sum` will calculate the sales value as a percent of all sales
    in the parent bucket
<3> `format` influences how to format the metric as a string using Java's `DecimalFormat` pattern. In this case, multiplying by 100
    and adding a '%'

And the following may be the response:

[source,console-result]
--------------------------------------------------
{
  "took": 11,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "sales_per_month": {
      "buckets": [
        {
          "key_as_string": "2015/01/01 00:00:00",
          "key": 1420070400000,
          "doc_count": 3,
          "sales": {
            "value": 550.0
          },
          "percent_of_total_sales": {
            "value": 0.5583756345177665,
            "value_as_string": "55.84%"
          }
        },
        {
          "key_as_string": "2015/02/01 00:00:00",
          "key": 1422748800000,
          "doc_count": 2,
          "sales": {
            "value": 60.0
          },
          "percent_of_total_sales": {
            "value": 0.06091370558375635,
            "value_as_string": "06.09%"
          }
        },
        {
          "key_as_string": "2015/03/01 00:00:00",
          "key": 1425168000000,
          "doc_count": 2,
          "sales": {
            "value": 375.0
          },
          "percent_of_total_sales": {
            "value": 0.38071065989847713,
            "value_as_string": "38.07%"
          }
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 11/"took": $body.took/]
// TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
// TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]
  153. // TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]