- [[search-aggregations]]
- = Aggregations
- [partintro]
- --
- An aggregation summarizes your data as metrics, statistics, or other analytics.
- Aggregations help you answer questions like:
- * What's the average load time for my website?
- * Who are my most valuable customers based on transaction volume?
- * What would be considered a large file on my network?
- * How many products are in each product category?
- {es} organizes aggregations into three categories:
- * <<search-aggregations-metrics,Metric>> aggregations that calculate metrics,
- such as a sum or average, from field values.
- * <<search-aggregations-bucket,Bucket>> aggregations that
- group documents into buckets, also called bins, based on field values, ranges,
- or other criteria.
- * <<search-aggregations-pipeline,Pipeline>> aggregations that take input from
- other aggregations instead of documents or fields.
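As a rough mental model only, the three categories can be sketched in plain Python over an in-memory list. This is not how {es} executes aggregations, and the sample documents and the field names `category` and `price` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical documents; the field names are illustrative only.
docs = [
    {"category": "books", "price": 10.0},
    {"category": "books", "price": 20.0},
    {"category": "games", "price": 60.0},
]

# Metric aggregation: calculate a single value from field values.
avg_price = mean(doc["price"] for doc in docs)

# Bucket aggregation: group documents into buckets by a field value.
buckets = defaultdict(list)
for doc in docs:
    buckets[doc["category"]].append(doc)
doc_counts = {key: len(group) for key, group in buckets.items()}

# A metric calculated per bucket (an average for each category).
avg_per_bucket = {
    key: mean(doc["price"] for doc in group) for key, group in buckets.items()
}

# Pipeline aggregation: takes the output of other aggregations as input,
# here the largest of the per-bucket averages.
max_bucket_avg = max(avg_per_bucket.values())
```

The pipeline step never touches `docs` directly; it only reads `avg_per_bucket`, which mirrors the distinction drawn in the list above.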
- [discrete]
- [[run-an-agg]]
- === Run an aggregation
- You can run aggregations as part of a <<search-your-data,search>> by specifying the <<search-search,search API>>'s `aggs` parameter. The
- following search runs a
- <<search-aggregations-bucket-terms-aggregation,terms aggregation>> on
- `my-field`:
- [source,console]
- ----
- GET /my-index-000001/_search
- {
- "aggs": {
- "my-agg-name": {
- "terms": {
- "field": "my-field"
- }
- }
- }
- }
- ----
- // TEST[setup:my_index]
- // TEST[s/my-field/http.request.method/]
- Aggregation results are in the response's `aggregations` object:
- [source,console-result]
- ----
- {
- "took": 78,
- "timed_out": false,
- "_shards": {
- "total": 1,
- "successful": 1,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": {
- "value": 5,
- "relation": "eq"
- },
- "max_score": 1.0,
- "hits": [...]
- },
- "aggregations": {
- "my-agg-name": { <1>
- "doc_count_error_upper_bound": 0,
- "sum_other_doc_count": 0,
- "buckets": []
- }
- }
- }
- ----
- // TESTRESPONSE[s/"took": 78/"took": "$body.took"/]
- // TESTRESPONSE[s/\.\.\.$/"took": "$body.took", "timed_out": false, "_shards": "$body._shards", /]
- // TESTRESPONSE[s/"hits": \[\.\.\.\]/"hits": "$body.hits.hits"/]
- // TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":"get","doc_count":5\}\]/]
- <1> Results for the `my-agg-name` aggregation.
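On the client side, reading these results amounts to looking up the aggregation by name. The following is a minimal sketch that assumes the response body is already available as JSON text; the bucket shown (`get` with five documents) is an illustrative value, not output from a real cluster.

```python
import json

# Hypothetical response body, trimmed to the `aggregations` object.
response_text = """
{
  "aggregations": {
    "my-agg-name": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {"key": "get", "doc_count": 5}
      ]
    }
  }
}
"""

response = json.loads(response_text)

# Look the aggregation up by the name used in the request.
agg = response["aggregations"]["my-agg-name"]

# Each terms bucket carries the term (`key`) and its document count.
term_counts = {bucket["key"]: bucket["doc_count"] for bucket in agg["buckets"]}
print(term_counts)
```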
- [discrete]
- [[change-agg-scope]]
- === Change an aggregation's scope
- Use the `query` parameter to limit the documents on which an aggregation runs:
- [source,console]
- ----
- GET /my-index-000001/_search
- {
- "query": {
- "range": {
- "@timestamp": {
- "gte": "now-1d/d",
- "lt": "now/d"
- }
- }
- },
- "aggs": {
- "my-agg-name": {
- "terms": {
- "field": "my-field"
- }
- }
- }
- }
- ----
- // TEST[setup:my_index]
- // TEST[s/my-field/http.request.method/]
- [discrete]
- [[return-only-agg-results]]
- === Return only aggregation results
- By default, searches containing an aggregation return both search hits and
- aggregation results. To return only aggregation results, set `size` to `0`:
- [source,console]
- ----
- GET /my-index-000001/_search
- {
- "size": 0,
- "aggs": {
- "my-agg-name": {
- "terms": {
- "field": "my-field"
- }
- }
- }
- }
- ----
- // TEST[setup:my_index]
- // TEST[s/my-field/http.request.method/]
- [discrete]
- [[run-multiple-aggs]]
- === Run multiple aggregations
- You can specify multiple aggregations in the same request:
- [source,console]
- ----
- GET /my-index-000001/_search
- {
- "aggs": {
- "my-first-agg-name": {
- "terms": {
- "field": "my-field"
- }
- },
- "my-second-agg-name": {
- "avg": {
- "field": "my-other-field"
- }
- }
- }
- }
- ----
- // TEST[setup:my_index]
- // TEST[s/my-field/http.request.method/]
- // TEST[s/my-other-field/http.response.bytes/]
- [discrete]
- [[run-sub-aggs]]
- === Run sub-aggregations
- Bucket aggregations support bucket or metric sub-aggregations. For example, a
- terms aggregation with an <<search-aggregations-metrics-avg-aggregation,avg>>
- sub-aggregation calculates an average value for each bucket of documents. There
- is no level or depth limit for nesting sub-aggregations.
- [source,console]
- ----
- GET /my-index-000001/_search
- {
- "aggs": {
- "my-agg-name": {
- "terms": {
- "field": "my-field"
- },
- "aggs": {
- "my-sub-agg-name": {
- "avg": {
- "field": "my-other-field"
- }
- }
- }
- }
- }
- }
- ----
- // TEST[setup:my_index]
- // TEST[s/_search/_search?size=0/]
- // TEST[s/my-field/http.request.method/]
- // TEST[s/my-other-field/http.response.bytes/]
- The response nests sub-aggregation results under their parent aggregation:
- [source,console-result]
- ----
- {
- ...
- "aggregations": {
- "my-agg-name": { <1>
- "doc_count_error_upper_bound": 0,
- "sum_other_doc_count": 0,
- "buckets": [
- {
- "key": "foo",
- "doc_count": 5,
- "my-sub-agg-name": { <2>
- "value": 75.0
- }
- }
- ]
- }
- }
- }
- ----
- // TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
- // TESTRESPONSE[s/"key": "foo"/"key": "get"/]
- // TESTRESPONSE[s/"value": 75.0/"value": $body.aggregations.my-agg-name.buckets.0.my-sub-agg-name.value/]
- <1> Results for the parent aggregation, `my-agg-name`.
- <2> Results for `my-agg-name`'s sub-aggregation, `my-sub-agg-name`.
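Because sub-aggregation results sit inside each parent bucket, a client reads them one bucket at a time. The sketch below assumes the response has already been parsed into a Python dictionary; `bucket_averages` is a hypothetical helper name, not part of any {es} client.

```python
def bucket_averages(response, agg_name, sub_agg_name):
    """Return (key, doc_count, sub-aggregation value) for each parent bucket."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [
        (bucket["key"], bucket["doc_count"], bucket[sub_agg_name]["value"])
        for bucket in buckets
    ]


# The example response above, trimmed to the `aggregations` object.
response = {
    "aggregations": {
        "my-agg-name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "foo",
                    "doc_count": 5,
                    "my-sub-agg-name": {"value": 75.0},
                }
            ],
        }
    }
}

rows = bucket_averages(response, "my-agg-name", "my-sub-agg-name")
```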
- [discrete]
- [[add-metadata-to-an-agg]]
- === Add custom metadata
- Use the `meta` object to associate custom metadata with an aggregation:
- [source,console]
- ----
- GET /my-index-000001/_search
- {
- "aggs": {
- "my-agg-name": {
- "terms": {
- "field": "my-field"
- },
- "meta": {
- "my-metadata-field": "foo"
- }
- }
- }
- }
- ----
- // TEST[setup:my_index]
- // TEST[s/_search/_search?size=0/]
- The response returns the `meta` object in place:
- [source,console-result]
- ----
- {
- ...
- "aggregations": {
- "my-agg-name": {
- "meta": {
- "my-metadata-field": "foo"
- },
- "doc_count_error_upper_bound": 0,
- "sum_other_doc_count": 0,
- "buckets": []
- }
- }
- }
- ----
- // TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
- [discrete]
- [[return-agg-type]]
- === Return the aggregation type
- By default, aggregation results include the aggregation's name but not its type.
- To return the aggregation type, use the `typed_keys` query parameter.
- [source,console]
- ----
- GET /my-index-000001/_search?typed_keys
- {
- "aggs": {
- "my-agg-name": {
- "histogram": {
- "field": "my-field",
- "interval": 1000
- }
- }
- }
- }
- ----
- // TEST[setup:my_index]
- // TEST[s/typed_keys/typed_keys&size=0/]
- // TEST[s/my-field/http.response.bytes/]
- The response returns the aggregation type as a prefix to the aggregation's name.
- IMPORTANT: Some aggregations return a different aggregation type from the
- type in the request. For example, the terms,
- <<search-aggregations-bucket-significantterms-aggregation,significant terms>>,
- and <<search-aggregations-metrics-percentile-aggregation,percentiles>>
- aggregations return different aggregation types depending on the data type of
- the aggregated field.
- [source,console-result]
- ----
- {
- ...
- "aggregations": {
- "histogram#my-agg-name": { <1>
- "buckets": []
- }
- }
- }
- ----
- // TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
- // TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":1070000.0,"doc_count":5\}\]/]
- <1> The aggregation type, `histogram`, followed by a `#` separator and the aggregation's name, `my-agg-name`.
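A client that requests `typed_keys` needs to split each key back into its type and name. A minimal sketch, assuming the parsed `aggregations` object is a Python dictionary; `split_typed_key` is a hypothetical helper name.

```python
def split_typed_key(typed_key):
    """Split 'type#name' into (type, name); type is None if there is no '#'."""
    # partition() splits on the first '#' only, so any later '#' stays in the name.
    agg_type, separator, name = typed_key.partition("#")
    if not separator:
        return None, typed_key
    return agg_type, name


# The `aggregations` object from the example response above.
aggregations = {"histogram#my-agg-name": {"buckets": []}}

parsed = {}
for typed_key, result in aggregations.items():
    agg_type, name = split_typed_key(typed_key)
    parsed[name] = (agg_type, result)
```

Keying `parsed` by the plain name keeps the rest of the client code independent of whether `typed_keys` was set.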
- [discrete]
- [[use-scripts-in-an-agg]]
- === Use scripts in an aggregation
- When a field doesn't exactly match the aggregation you need, you
- should aggregate on a <<runtime,runtime field>>:
- [source,console]
- ----
- GET /my-index-000001/_search?size=0
- {
- "runtime_mappings": {
- "message.length": {
- "type": "long",
- "script": "emit(doc['message.keyword'].value.length())"
- }
- },
- "aggs": {
- "message_length": {
- "histogram": {
- "interval": 10,
- "field": "message.length"
- }
- }
- }
- }
- ----
- // TEST[setup:my_index]
- ////
- [source,console-result]
- ----
- {
- "timed_out": false,
- "took": "$body.took",
- "_shards": {
- "total": 1,
- "successful": 1,
- "failed": 0,
- "skipped": 0
- },
- "hits": "$body.hits",
- "aggregations": {
- "message_length": {
- "buckets": [
- {
- "key": 30.0,
- "doc_count": 5
- }
- ]
- }
- }
- }
- ----
- ////
- Scripts calculate field values dynamically, which adds a little
- overhead to the aggregation. In addition to the time spent calculating,
- some aggregations like <<search-aggregations-bucket-terms-aggregation,`terms`>>
- and <<search-aggregations-bucket-filters-aggregation,`filters`>> can't use
- some of their optimizations with runtime fields. In total, the performance cost
- of using a runtime field varies from aggregation to aggregation.
- // TODO when we have calculated fields we can link to them here.
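To make the runtime field example concrete, the sketch below emulates in plain Python what that request computes: a length per message, then a histogram bucket key per length. The messages are hypothetical, and the key formula `floor((value - offset) / interval) * interval + offset` is the rounding described for the histogram aggregation; treat the whole block as an illustration rather than {es} behavior verified against a cluster.

```python
import math
from collections import Counter


def histogram_key(value, interval, offset=0.0):
    """Round a value down to the start of its histogram bucket."""
    return math.floor((value - offset) / interval) * interval + offset


# Hypothetical ASCII messages standing in for `message.keyword` values.
messages = [
    "GET /search HTTP/1.1 200 1070000",  # 32 characters
    "short message",                     # 13 characters
    "0123456789",                        # 10 characters
]

# The runtime field script emits one length per document.
lengths = [len(message) for message in messages]

# The histogram aggregation with `interval: 10` counts documents per bucket key.
counts = Counter(histogram_key(length, 10) for length in lengths)
```

Here `counts` maps `30.0` to one document and `10.0` to two. For non-ASCII text, Python's `len` and the script's `length()` may disagree, so the emulation is only exact for ASCII messages.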
- [discrete]
- [[agg-caches]]
- === Aggregation caches
- For faster responses, {es} caches the results of frequently run aggregations in
- the <<shard-request-cache,shard request cache>>. To get cached results, use the
- same <<shard-and-node-preference,`preference` string>> for each search. If you
- don't need search hits, <<return-only-agg-results,set `size` to `0`>> to avoid
- filling the cache.
- {es} routes searches with the same preference string to the same shards. If the
- shards' data doesn't change between searches, the shards return cached
- aggregation results.
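One way to apply both recommendations is to build every aggregation-only search with the same query parameters. The sketch below only assembles the request path; `cached_agg_search_path` and the preference value `my-custom-shard-string` are hypothetical names chosen for illustration.

```python
from urllib.parse import urlencode


def cached_agg_search_path(index, preference):
    """Build a search path that skips hits and reuses one preference string."""
    params = {"size": 0, "preference": preference}
    return f"/{index}/_search?{urlencode(params)}"


# Reusing the same preference string sends repeat searches to the same shards.
path = cached_agg_search_path("my-index-000001", "my-custom-shard-string")
print(path)
```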
- [discrete]
- [[limits-for-long-values]]
- === Limits for `long` values
- When running aggregations, {es} uses <<number,`double`>> values to hold and
- represent numeric data. As a result, aggregations on <<number,`long`>> numbers
- greater than +2^53^+ are approximate.
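The limit comes from the 53-bit significand of a double. Python's `float` is a 64-bit double on common platforms, so the effect can be reproduced without {es}; `survives_double` is a hypothetical helper name.

```python
LIMIT = 2**53  # 9007199254740992


def survives_double(value):
    """Return True if an integer is unchanged after a round trip through a double."""
    return int(float(value)) == value


below = survives_double(LIMIT - 1)  # every integer up to 2^53 is exact
at = survives_double(LIMIT)
above = survives_double(LIMIT + 1)  # rounds to a neighboring double

print(below, at, above)
```

Some integers above the limit, such as `LIMIT + 2`, still happen to be exact, which is why results in that range are best treated as approximate rather than assumed wrong.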
- --
- include::aggregations/bucket.asciidoc[]
- include::aggregations/metrics.asciidoc[]
- include::aggregations/pipeline.asciidoc[]