[[search-aggregations-pipeline-percentiles-bucket-aggregation]]
=== Percentiles Bucket Aggregation

A sibling pipeline aggregation which calculates percentiles across all buckets of a specified metric in a sibling aggregation.
The specified metric must be numeric and the sibling aggregation must be a multi-bucket aggregation.

==== Syntax

A `percentiles_bucket` aggregation looks like this in isolation:

[source,js]
--------------------------------------------------
{
  "percentiles_bucket": {
    "buckets_path": "the_sum"
  }
}
--------------------------------------------------
// NOTCONSOLE

[[percentiles-bucket-params]]
.`percentiles_bucket` Parameters
[options="header"]
|===
|Parameter Name |Description |Required |Default Value
|`buckets_path` |The path to the buckets we wish to find the percentiles for (see <<buckets-path-syntax>> for more details) |Required |
|`gap_policy` |The policy to apply when gaps are found in the data (see <<gap-policy>> for more details) |Optional | `skip`
|`format` |Format to apply to the output value of this aggregation |Optional | `null`
|`percents` |The list of percentiles to calculate |Optional | `[ 1, 5, 25, 50, 75, 95, 99 ]`
|`keyed` |Flag which returns the range as a hash instead of an array of key-value pairs |Optional | `true`
|===

The following snippet calculates the percentiles for the total monthly `sales` buckets:

[source,console]
--------------------------------------------------
POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "percentiles_monthly_sales": {
      "percentiles_bucket": {
        "buckets_path": "sales_per_month>sales", <1>
        "percents": [ 25.0, 50.0, 75.0 ]         <2>
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]

<1> `buckets_path` tells this `percentiles_bucket` aggregation to calculate percentiles for
the `sales` aggregation in the `sales_per_month` date histogram.
<2> `percents` specifies which percentiles to calculate, in this case the 25th, 50th and 75th percentiles.

The following may be the response:

[source,console-result]
--------------------------------------------------
{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               }
            }
         ]
      },
      "percentiles_monthly_sales": {
        "values" : {
            "25.0": 375.0,
            "50.0": 375.0,
            "75.0": 550.0
         }
      }
   }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 11/"took": $body.took/]
// TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
// TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]

==== Percentiles_bucket implementation

The Percentile Bucket returns the nearest input data point that is not greater than the requested percentile; it does not
interpolate between data points.

The percentiles are calculated exactly and are not approximations (unlike the Percentiles Metric). This means
the implementation maintains an in-memory, sorted list of your data to compute the percentiles, before discarding the
data. You may run into memory pressure issues if you attempt to calculate percentiles over many millions of
data points in a single `percentiles_bucket`.
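
The selection rule described above (sort the bucket values, then pick an existing data point for each percentile) can be sketched in a few lines of Python. This is a minimal illustration, not the Elasticsearch implementation: the function name `percentiles_bucket` and the index formula (rounding `percent / 100 * (n - 1)` half up to a position in the sorted list) are assumptions, chosen because they reproduce the values in the example response for this aggregation.

```python
import math


def percentiles_bucket(values, percents):
    """Pick one existing data point per requested percentile, without interpolation.

    Hypothetical helper for illustration only; the rounding rule is an assumption.
    """
    if not values:
        return {}
    # Keep a sorted in-memory copy of every bucket value; this is the
    # memory cost that grows with the number of data points.
    data = sorted(values)
    result = {}
    for percent in percents:
        # Assumed rule: map the percentile to a list position, rounding half up.
        index = math.floor(percent / 100.0 * (len(data) - 1) + 0.5)
        result[str(float(percent))] = data[index]
    return result


# Monthly `sales` sums from the example response: 550.0, 60.0 and 375.0.
monthly_sales = [550.0, 60.0, 375.0]
print(percentiles_bucket(monthly_sales, [25.0, 50.0, 75.0]))
```

Because every returned value is taken directly from the sorted list, the result never contains a number that does not appear in some bucket, which is why the 25th and 50th percentiles can coincide when there are only a few buckets.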