- [[search-aggregations-matrix-stats-aggregation]]
- === Matrix stats aggregation
- ++++
- <titleabbrev>Matrix stats</titleabbrev>
- ++++
- The `matrix_stats` aggregation is a numeric aggregation that computes the following statistics over a set of document fields:
- [horizontal]
- `count`:: Number of samples per field included in the calculation.
- `mean`:: The average value for each field.
- `variance`:: Per-field measurement of how spread out the samples are from the mean.
- `skewness`:: Per-field measurement quantifying the asymmetry of the distribution around the mean.
- `kurtosis`:: Per-field measurement quantifying the shape of the distribution.
- `covariance`:: A matrix that quantitatively describes how changes in one field are associated with another.
- `correlation`:: The covariance matrix scaled to a range of -1 to 1, inclusive. Describes the relationship between field
- distributions.
- IMPORTANT: Unlike other metric aggregations, the `matrix_stats` aggregation does
- not support scripting.
- //////////////////////////
- [source,js]
- --------------------------------------------------
- PUT /statistics/_doc/0
- {"poverty": 24.0, "income": 50000.0}
- PUT /statistics/_doc/1
- {"poverty": 13.0, "income": 95687.0}
- PUT /statistics/_doc/2
- {"poverty": 69.0, "income": 7890.0}
- POST /_refresh
- --------------------------------------------------
- // NOTCONSOLE
- // TESTSETUP
- //////////////////////////
- The following example demonstrates the use of matrix stats to describe the relationship between income and poverty.
- [source,console,id=stats-aggregation-example]
- --------------------------------------------------
- GET /_search
- {
-   "aggs": {
-     "statistics": {
-       "matrix_stats": {
-         "fields": [ "poverty", "income" ]
-       }
-     }
-   }
- }
- --------------------------------------------------
- // TEST[s/_search/_search\?filter_path=aggregations/]
- The aggregation type is `matrix_stats` and the `fields` setting defines the set of fields (as an array) for computing
- the statistics. The above request returns the following response:
- [source,console-result]
- --------------------------------------------------
- {
-   ...
-   "aggregations": {
-     "statistics": {
-       "doc_count": 50,
-       "fields": [ {
-         "name": "income",
-         "count": 50,
-         "mean": 51985.1,
-         "variance": 7.383377037755103E7,
-         "skewness": 0.5595114003506483,
-         "kurtosis": 2.5692365287787124,
-         "covariance": {
-           "income": 7.383377037755103E7,
-           "poverty": -21093.65836734694
-         },
-         "correlation": {
-           "income": 1.0,
-           "poverty": -0.8352655256272504
-         }
-       }, {
-         "name": "poverty",
-         "count": 50,
-         "mean": 12.732000000000001,
-         "variance": 8.637730612244896,
-         "skewness": 0.4516049811903419,
-         "kurtosis": 2.8615929677997767,
-         "covariance": {
-           "income": -21093.65836734694,
-           "poverty": 8.637730612244896
-         },
-         "correlation": {
-           "income": -0.8352655256272504,
-           "poverty": 1.0
-         }
-       } ]
-     }
-   }
- }
- --------------------------------------------------
- // TESTRESPONSE[s/\.\.\.//]
- // TESTRESPONSE[s/: (\-)?[0-9\.E]+/: $body.$_path/]
- The `doc_count` field indicates the number of documents involved in the computation of the statistics.
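- As a quick consistency check on the response above, the `correlation` entries can be reproduced from the `covariance` and `variance` entries. This sketch assumes the standard Pearson definition; the values are rounded from the sample response.
- [stem]
- ++++
- \mathrm{corr}(\text{income}, \text{poverty})
-   = \frac{\mathrm{cov}(\text{income}, \text{poverty})}{\sqrt{\mathrm{var}(\text{income}) \, \mathrm{var}(\text{poverty})}}
-   \approx \frac{-21093.66}{\sqrt{7.3834 \times 10^{7} \times 8.6377}}
-   \approx -0.8353
- ++++
- The diagonal entries of the covariance matrix equal each field's `variance`, which is why the diagonal of the correlation matrix is `1.0`.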
- ==== Multi Value Fields
- The `matrix_stats` aggregation treats each document field as an independent sample. For array or multi-valued fields, the
- `mode` parameter controls which single value the aggregation uses. This parameter can take one of the following values:
- [horizontal]
- `avg`:: (default) Use the average of all values.
- `min`:: Pick the lowest value.
- `max`:: Pick the highest value.
- `sum`:: Use the sum of all values.
- `median`:: Use the median of all values.
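- The following request is a minimal sketch of setting `mode`, assuming the same `poverty` and `income` fields as in the earlier example. The aggregation name `statistics_min` is hypothetical. With `min`, a document whose `income` field holds several values contributes only the lowest one.
- [source,console]
- --------------------------------------------------
- GET /_search
- {
-   "aggs": {
-     "statistics_min": {
-       "matrix_stats": {
-         "fields": [ "poverty", "income" ],
-         "mode": "min"
-       }
-     }
-   }
- }
- --------------------------------------------------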
- ==== Missing Values
- The `missing` parameter defines how documents that are missing a value should be treated.
- By default, these documents are ignored, but they can also be treated as if they had a value.
- To do so, add a set of `fieldname: value` mappings that specify a default value for each field.
- [source,console,id=stats-aggregation-missing-example]
- --------------------------------------------------
- GET /_search
- {
-   "aggs": {
-     "matrixstats": {
-       "matrix_stats": {
-         "fields": [ "poverty", "income" ],
-         "missing": { "income": 50000 } <1>
-       }
-     }
-   }
- }
- --------------------------------------------------
- <1> Documents without a value in the `income` field will have the default value `50000`.