- [[search-aggregations-pipeline-derivative-aggregation]]
- === Derivative aggregation
- ++++
- <titleabbrev>Derivative</titleabbrev>
- ++++
- A parent pipeline aggregation that calculates the derivative of a specified metric in a parent `histogram` (or `date_histogram`)
- aggregation. The specified metric must be numeric, and the enclosing histogram must have `min_doc_count` set to `0` (the default
- for `histogram` aggregations).
- ==== Syntax
- A `derivative` aggregation looks like this in isolation:
- [source,js]
- --------------------------------------------------
- "derivative": {
- "buckets_path": "the_sum"
- }
- --------------------------------------------------
- // NOTCONSOLE
- [[derivative-params]]
- .`derivative` Parameters
- [options="header"]
- |===
- |Parameter Name |Description |Required |Default Value
- |`buckets_path` |The path to the buckets we wish to find the derivative for (see <<buckets-path-syntax>> for more
- details) |Required |
- |`gap_policy` |The policy to apply when gaps are found in the data (see <<gap-policy>> for more
- details) |Optional |`skip`
- |`format` |The format to apply to the output value of this aggregation |Optional |`null`
- |===
- ==== First Order Derivative
- The following snippet calculates the derivative of the total monthly `sales`:
- [source,console]
- --------------------------------------------------
- POST /sales/_search
- {
- "size": 0,
- "aggs": {
- "sales_per_month": {
- "date_histogram": {
- "field": "date",
- "calendar_interval": "month"
- },
- "aggs": {
- "sales": {
- "sum": {
- "field": "price"
- }
- },
- "sales_deriv": {
- "derivative": {
- "buckets_path": "sales" <1>
- }
- }
- }
- }
- }
- }
- --------------------------------------------------
- // TEST[setup:sales]
- <1> `buckets_path` instructs this derivative aggregation to use the output of the `sales` aggregation as its input.
- The response may look like this:
- [source,console-result]
- --------------------------------------------------
- {
- "took": 11,
- "timed_out": false,
- "_shards": ...,
- "hits": ...,
- "aggregations": {
- "sales_per_month": {
- "buckets": [
- {
- "key_as_string": "2015/01/01 00:00:00",
- "key": 1420070400000,
- "doc_count": 3,
- "sales": {
- "value": 550.0
- } <1>
- },
- {
- "key_as_string": "2015/02/01 00:00:00",
- "key": 1422748800000,
- "doc_count": 2,
- "sales": {
- "value": 60.0
- },
- "sales_deriv": {
- "value": -490.0 <2>
- }
- },
- {
- "key_as_string": "2015/03/01 00:00:00",
- "key": 1425168000000,
- "doc_count": 2, <3>
- "sales": {
- "value": 375.0
- },
- "sales_deriv": {
- "value": 315.0
- }
- }
- ]
- }
- }
- }
- --------------------------------------------------
- // TESTRESPONSE[s/"took": 11/"took": $body.took/]
- // TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
- // TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]
- <1> The first bucket has no derivative because at least two data points are needed to calculate one.
- <2> The units of the derivative value are implicitly defined by the `sales` aggregation and the parent histogram, so in this case
- the units are $/month, assuming the `price` field has units of $.
- <3> The number of documents in the bucket is represented by `doc_count`.
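The values in the response above are plain differences between consecutive bucket values. The following is a minimal Python sketch of that arithmetic, not the Elasticsearch implementation. The helper name `first_differences` is hypothetical, and the input values are copied from the example response.

```python
from typing import List, Optional


def first_differences(values: List[float]) -> List[Optional[float]]:
    """Return the bucket-to-bucket difference for each value.

    The first entry is None because there is no previous bucket to
    compare against (illustrative helper, not part of Elasticsearch).
    """
    diffs: List[Optional[float]] = []
    previous: Optional[float] = None
    for value in values:
        diffs.append(None if previous is None else value - previous)
        previous = value
    return diffs


# Monthly `sales` sums taken from the example response.
monthly_sales = [550.0, 60.0, 375.0]
print(first_differences(monthly_sales))  # [None, -490.0, 315.0]
```

The `None` in the first position corresponds to the first bucket, which carries no `sales_deriv` entry in the response.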
- ==== Second Order Derivative
- A second-order derivative can be calculated by chaining one derivative pipeline aggregation onto the result of another.
- The following example calculates both the first- and second-order derivatives of the total
- monthly sales:
- [source,console]
- --------------------------------------------------
- POST /sales/_search
- {
- "size": 0,
- "aggs": {
- "sales_per_month": {
- "date_histogram": {
- "field": "date",
- "calendar_interval": "month"
- },
- "aggs": {
- "sales": {
- "sum": {
- "field": "price"
- }
- },
- "sales_deriv": {
- "derivative": {
- "buckets_path": "sales"
- }
- },
- "sales_2nd_deriv": {
- "derivative": {
- "buckets_path": "sales_deriv" <1>
- }
- }
- }
- }
- }
- }
- --------------------------------------------------
- // TEST[setup:sales]
- <1> `buckets_path` for the second derivative points to the name of the first derivative.
- The response may look like this:
- [source,console-result]
- --------------------------------------------------
- {
- "took": 50,
- "timed_out": false,
- "_shards": ...,
- "hits": ...,
- "aggregations": {
- "sales_per_month": {
- "buckets": [
- {
- "key_as_string": "2015/01/01 00:00:00",
- "key": 1420070400000,
- "doc_count": 3,
- "sales": {
- "value": 550.0
- } <1>
- },
- {
- "key_as_string": "2015/02/01 00:00:00",
- "key": 1422748800000,
- "doc_count": 2,
- "sales": {
- "value": 60.0
- },
- "sales_deriv": {
- "value": -490.0
- } <1>
- },
- {
- "key_as_string": "2015/03/01 00:00:00",
- "key": 1425168000000,
- "doc_count": 2,
- "sales": {
- "value": 375.0
- },
- "sales_deriv": {
- "value": 315.0
- },
- "sales_2nd_deriv": {
- "value": 805.0
- }
- }
- ]
- }
- }
- }
- --------------------------------------------------
- // TESTRESPONSE[s/"took": 50/"took": $body.took/]
- // TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
- // TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]
- <1> The first two buckets have no second derivative because at least two data points from the first derivative are needed to
- calculate it.
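The chaining can be illustrated by applying the same differencing step twice. This is a hedged Python sketch of the arithmetic only. The helper name `differences` is hypothetical, and missing values are modeled as `None` purely for illustration.

```python
from typing import List, Optional


def differences(values: List[Optional[float]]) -> List[Optional[float]]:
    """Difference each value with its predecessor; None where either is missing."""
    result: List[Optional[float]] = []
    previous: Optional[float] = None
    for value in values:
        if previous is None or value is None:
            result.append(None)
        else:
            result.append(value - previous)
        previous = value
    return result


# Monthly `sales` sums taken from the example response.
monthly_sales = [550.0, 60.0, 375.0]
first = differences(monthly_sales)  # [None, -490.0, 315.0]
second = differences(first)         # [None, None, 805.0]
print(second)
```

Each pass loses one leading data point, which is why the second derivative only appears in the third bucket.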
- ==== Units
- The derivative aggregation allows the units of the derivative values to be specified with the `unit` parameter. The response then
- includes an extra field, `normalized_value`, which reports the derivative value in the desired x-axis units. The example below
- calculates the derivative of the total sales per month but requests the result in units of sales per day:
- [source,console]
- --------------------------------------------------
- POST /sales/_search
- {
- "size": 0,
- "aggs": {
- "sales_per_month": {
- "date_histogram": {
- "field": "date",
- "calendar_interval": "month"
- },
- "aggs": {
- "sales": {
- "sum": {
- "field": "price"
- }
- },
- "sales_deriv": {
- "derivative": {
- "buckets_path": "sales",
- "unit": "day" <1>
- }
- }
- }
- }
- }
- }
- --------------------------------------------------
- // TEST[setup:sales]
- <1> `unit` specifies what unit to use for the x-axis of the derivative calculation.
- The response may look like this:
- [source,console-result]
- --------------------------------------------------
- {
- "took": 50,
- "timed_out": false,
- "_shards": ...,
- "hits": ...,
- "aggregations": {
- "sales_per_month": {
- "buckets": [
- {
- "key_as_string": "2015/01/01 00:00:00",
- "key": 1420070400000,
- "doc_count": 3,
- "sales": {
- "value": 550.0
- } <1>
- },
- {
- "key_as_string": "2015/02/01 00:00:00",
- "key": 1422748800000,
- "doc_count": 2,
- "sales": {
- "value": 60.0
- },
- "sales_deriv": {
- "value": -490.0, <1>
- "normalized_value": -15.806451612903226 <2>
- }
- },
- {
- "key_as_string": "2015/03/01 00:00:00",
- "key": 1425168000000,
- "doc_count": 2,
- "sales": {
- "value": 375.0
- },
- "sales_deriv": {
- "value": 315.0,
- "normalized_value": 11.25
- }
- }
- ]
- }
- }
- }
- --------------------------------------------------
- // TESTRESPONSE[s/"took": 50/"took": $body.took/]
- // TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
- // TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]
- <1> `value` is reported in the original units of 'per month'.
- <2> `normalized_value` is reported in the desired units of 'per day'.
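The `normalized_value` figures in this response can be reproduced by dividing each derivative by the number of days between the bucket key and the previous bucket key. The Python sketch below assumes that this is how the normalization works. That assumption is inferred from the example numbers (31 days from January to February, 28 days from February to March) and is not a statement about the implementation. The helper name `normalized_per_day` is hypothetical.

```python
MILLIS_PER_DAY = 24 * 60 * 60 * 1000


def normalized_per_day(value: float, previous_key_ms: int, key_ms: int) -> float:
    """Divide a derivative by the number of days between two bucket keys.

    Assumption: the x-axis distance is the gap between consecutive
    bucket keys, expressed in the requested unit (here, days).
    """
    days = (key_ms - previous_key_ms) / MILLIS_PER_DAY
    return value / days


# Bucket keys (epoch milliseconds) and derivative values from the example response.
jan, feb, mar = 1420070400000, 1422748800000, 1425168000000
print(normalized_per_day(-490.0, jan, feb))  # -490 divided by 31 days
print(normalized_per_day(315.0, feb, mar))   # 315 divided by 28 days
```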