| 12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576 | [[search-aggregations-bucket-sampler-aggregation]]=== Sampler Aggregationexperimental[]A filtering aggregation used to limit any sub aggregations' processing to a sample of the top-scoring documents..Example use cases:* Tightening the focus of analytics to high-relevance matches rather than the potentially very long tail of low-quality matches* Reducing the running cost of aggregations that can produce useful results using only samples e.g. `significant_terms` Example:[source,js]--------------------------------------------------{    "query": {        "match": {            "text": "iphone"        }    },    "aggs": {        "sample": {            "sampler": {                "shard_size": 200            },            "aggs": {                "keywords": {                    "significant_terms": {                        "field": "text"                    }                }            }        }    }}--------------------------------------------------Response:[source,js]--------------------------------------------------{    ...        "aggregations": {        "sample": {            "doc_count": 1000,<1>            "keywords": {                "doc_count": 1000,                "buckets": [                    ...                    {                        "key": "bend",                        "doc_count": 58,                        "score": 37.982536582524276,                        "bg_count": 103                    },                    ....}--------------------------------------------------<1> 1000 documents were sampled in total because we asked for a maximum of 200 from an index with 5 shards. The cost of performing the nested significant_terms aggregation was therefore limited rather than unbounded.==== shard_sizeThe `shard_size` parameter limits how many top-scoring documents are collected in the sample processed on each shard.The default value is 100.==== Limitations===== Cannot be nested under `breadth_first` aggregationsBeing a quality-based filter the sampler aggregation needs access to the relevance score produced for each document.It therefore cannot be nested under a `terms` aggregation which has the `collect_mode` switched from the default `depth_first` mode to `breadth_first` as this discards scores.In this situation an error will be thrown.
 |