[[search-aggregations-metrics-weight-avg-aggregation]]
=== Weighted Avg Aggregation

A `single-value` metrics aggregation that computes the weighted average of numeric values extracted from the aggregated documents.
These values can be extracted either from specific numeric fields in the documents, or provided by a script.

When calculating a regular average, each datapoint has an equal "weight": it contributes equally to the final value. Weighted averages,
on the other hand, weight each datapoint differently. The amount that each datapoint contributes to the final value is extracted from the
document, or provided by a script.

As a formula, a weighted average is `∑(value * weight) / ∑(weight)`.

A regular average can be thought of as a weighted average where every value has an implicit weight of `1`.

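
To make the formula concrete, the following minimal Python sketch computes it outside of Elasticsearch. The function name `weighted_avg` and the sample grades and weights are illustrative assumptions, not part of the aggregation API. Rejecting a zero weight sum is a choice of this sketch, not documented aggregation behavior.

[source,python]
--------------------------------------------------
def weighted_avg(values, weights):
    """Return sum(value * weight) / sum(weight) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        # Choice made for this sketch only; avoids a division by zero.
        raise ValueError("sum of weights must be non-zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical sample data, not taken from the `exams` index.
grades = [90, 60, 75]
weights = [3, 1, 2]

weighted = weighted_avg(grades, weights)
# A regular average is the same formula with every weight set to 1.
regular = weighted_avg(grades, [1] * len(grades))
--------------------------------------------------

Here `weighted` is `80.0`, while `regular` is `75.0`, the same as the plain mean of the three grades.
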
[[weighted-avg-params]]
.`weighted_avg` Parameters
[options="header"]
|===
|Parameter Name |Description |Required |Default Value
|`value` | The configuration for the field or script that provides the values |Required |
|`weight` | The configuration for the field or script that provides the weights |Required |
|`format` | The numeric response formatter |Optional |
|`value_type` | A hint about the values for pure scripts or unmapped fields |Optional |
|===

The `value` and `weight` objects each have their own configuration:

[[value-params]]
.`value` Parameters
[options="header"]
|===
|Parameter Name |Description |Required |Default Value
|`field` | The field that values should be extracted from |Required, unless `script` is set |
|`missing` | A value to use if the field is missing entirely |Optional |
|`script` | A script which provides the values for the document. This is mutually exclusive with `field` |Optional |
|===

[[weight-params]]
.`weight` Parameters
[options="header"]
|===
|Parameter Name |Description |Required |Default Value
|`field` | The field that weights should be extracted from |Required, unless `script` is set |
|`missing` | A weight to use if the field is missing entirely |Optional |
|`script` | A script which provides the weights for the document. This is mutually exclusive with `field` |Optional |
|===

==== Examples

If our documents have a `"grade"` field that holds a 0-100 numeric score, and a `"weight"` field which holds an arbitrary numeric weight,
we can calculate the weighted average using:

[source,console]
--------------------------------------------------
POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "field": "grade"
        },
        "weight": {
          "field": "weight"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:exams]

Which yields a response like:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "weighted_grade": {
      "value": 70.0
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]

While multiple values per field are allowed, only one weight is allowed per document. If the aggregation encounters
a document that has more than one weight (e.g. the weight field is a multi-valued field) it will throw an exception.
In that situation, specify a `script` for the weight and use the script
to combine the multiple values into a single weight.

This single weight is applied independently to each value extracted from the `value` field.

This example shows how a single document with multiple values is averaged with a single weight:

[source,console]
--------------------------------------------------
POST /exams/_doc?refresh
{
  "grade": [1, 2, 3],
  "weight": 2
}

POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "field": "grade"
        },
        "weight": {
          "field": "weight"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST

The three values (`1`, `2`, and `3`) will be included as independent values, all with the weight of `2`:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "weighted_grade": {
      "value": 2.0
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]

The aggregation returns `2.0` as the result, which matches what we would expect when calculating by hand:
`((1*2) + (2*2) + (3*2)) / (2+2+2) == 2`

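
The per-document pairing described above can be sketched in Python. This illustrates the documented semantics only and is not the aggregation's actual implementation. The helper names `expand_document` and `combine_weights` are hypothetical, and summing the weights is just one possible rule a weight script might apply.

[source,python]
--------------------------------------------------
def expand_document(values, weights):
    """Pair every value of one document with that document's single weight."""
    if len(weights) > 1:
        # Mirrors the documented restriction: more than one weight is an error.
        raise ValueError("a document may carry only one weight")
    # A missing weight is treated as 1, as described under "Missing values".
    weight = weights[0] if weights else 1
    return [(value, weight) for value in values]


def combine_weights(weights):
    """Hypothetical combining rule a weight script could apply: the sum."""
    return [sum(weights)]


# The document indexed above: grade [1, 2, 3] with weight 2.
pairs = expand_document([1, 2, 3], [2])
result = sum(v * w for v, w in pairs) / sum(w for _, w in pairs)
--------------------------------------------------

With the sample document, `pairs` holds each grade paired with the weight `2` and `result` is `2.0`, matching the hand calculation. Passing two weights directly raises an error, whereas passing them through `combine_weights` first yields a single usable weight.
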
==== Script

Both the value and the weight can be derived from a script, instead of a field. As a simple example, the following
will add one to the grade and weight in the document using a script:

[source,console]
--------------------------------------------------
POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "script": "doc.grade.value + 1"
        },
        "weight": {
          "script": "doc.weight.value + 1"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:exams]

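
As a rough illustration of what these two scripts change, the Python sketch below applies the same "add one" adjustments to a couple of documents before averaging. The documents and the function names `value_script` and `weight_script` are assumptions made for this example; they are not the contents of the `exams` index.

[source,python]
--------------------------------------------------
# Hypothetical documents, chosen so the arithmetic is easy to follow.
docs = [
    {"grade": 99, "weight": 1},
    {"grade": 49, "weight": 2},
]


def value_script(doc):
    """Stand-in for the script `doc.grade.value + 1`."""
    return doc["grade"] + 1


def weight_script(doc):
    """Stand-in for the script `doc.weight.value + 1`."""
    return doc["weight"] + 1


# Each document contributes its scripted value times its scripted weight.
numerator = sum(value_script(d) * weight_script(d) for d in docs)
denominator = sum(weight_script(d) for d in docs)
scripted_avg = numerator / denominator
--------------------------------------------------

For these two documents the scripted values are `100` and `50` with weights `2` and `3`, so `scripted_avg` is `(100*2 + 50*3) / (2+3) == 70.0`.
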
==== Missing values

The `missing` parameter defines how documents that are missing a value should be treated.
The default behavior is different for `value` and `weight`:

By default, if the `value` field is missing the document is ignored and the aggregation moves on to the next document.
If the `weight` field is missing, it is assumed to have a weight of `1` (like a normal average).

Both of these defaults can be overridden with the `missing` parameter:

[source,console]
--------------------------------------------------
POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "field": "grade",
          "missing": 2
        },
        "weight": {
          "field": "weight",
          "missing": 3
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:exams]

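
The effect of the defaults and of the two `missing` overrides can be sketched in Python. The function `weighted_avg_with_missing`, its parameter names, and the sample documents are hypothetical. Returning `None` when no document contributes is a choice of this sketch, not a statement about the aggregation's response.

[source,python]
--------------------------------------------------
def weighted_avg_with_missing(docs, value_missing=None, weight_missing=1):
    """Weighted average over dicts that may lack a "grade" or "weight" key."""
    numerator = 0.0
    denominator = 0.0
    for doc in docs:
        value = doc.get("grade", value_missing)
        if value is None:
            # Default behavior: a document without a value is ignored.
            continue
        # Default behavior: a document without a weight gets a weight of 1.
        weight = doc.get("weight", weight_missing)
        numerator += value * weight
        denominator += weight
    return numerator / denominator if denominator else None


# Hypothetical documents: one complete, one without weight, one without grade.
docs = [
    {"grade": 80, "weight": 2},
    {"grade": 50},
    {"weight": 5},
]

with_defaults = weighted_avg_with_missing(docs)
with_overrides = weighted_avg_with_missing(docs, value_missing=2, weight_missing=3)
--------------------------------------------------

With the defaults, the third document is skipped and the second gets a weight of `1`, giving `(80*2 + 50*1) / (2+1) == 70.0`. With `missing` set to `2` for the value and `3` for the weight, as in the request above, all three documents contribute: `(80*2 + 50*3 + 2*5) / (2+3+5) == 32.0`.
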