123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196 |
- [[search-aggregations-metrics-top-hits-aggregation]]
- === Top hits Aggregation
- A `top_hits` metric aggregator keeps track of the most relevant document being aggregated. This aggregator is intended
- to be used as a sub aggregator, so that the top matching documents can be aggregated per bucket.
- The `top_hits` aggregator can effectively be used to group result sets by certain fields via a bucket aggregator.
- One or more bucket aggregators determines by which properties a result set get sliced into.
- ==== Options
- * `from` - The offset from the first result you want to fetch.
- * `size` - The maximum number of top matching hits to return per bucket. By default the top three matching hits are returned.
- * `sort` - How the top matching hits should be sorted. By default the hits are sorted by the score of the main query.
- ==== Supported per hit features
- The top_hits aggregation returns regular search hits, because of this many per hit features can be supported:
- * <<search-request-highlighting,Highlighting>>
- * <<search-request-explain,Explain>>
- * <<search-request-named-queries-and-filters,Named filters and queries>>
- * <<search-request-source-filtering,Source filtering>>
- * <<search-request-script-fields,Script fields>>
- * <<search-request-fielddata-fields,Fielddata fields>>
- * <<search-request-version,Include versions>>
- ==== Example
- In the following example we group the questions by tag and per tag we show the last active question. For each question
- only the title field is being included in the source.
- [source,js]
- --------------------------------------------------
- {
- "aggs": {
- "top-tags": {
- "terms": {
- "field": "tags",
- "size": 3
- },
- "aggs": {
- "top_tag_hits": {
- "top_hits": {
- "sort": [
- {
- "last_activity_date": {
- "order": "desc"
- }
- }
- ],
- "_source": {
- "include": [
- "title"
- ]
- },
- "size" : 1
- }
- }
- }
- }
- }
- }
- --------------------------------------------------
- Possible response snippet:
- [source,js]
- --------------------------------------------------
- "aggregations": {
- "top-tags": {
- "buckets": [
- {
- "key": "windows-7",
- "doc_count": 25365,
- "top_tags_hits": {
- "hits": {
- "total": 25365,
- "max_score": 1,
- "hits": [
- {
- "_index": "stack",
- "_type": "question",
- "_id": "602679",
- "_score": 1,
- "_source": {
- "title": "Windows port opening"
- },
- "sort": [
- 1370143231177
- ]
- }
- ]
- }
- }
- },
- {
- "key": "linux",
- "doc_count": 18342,
- "top_tags_hits": {
- "hits": {
- "total": 18342,
- "max_score": 1,
- "hits": [
- {
- "_index": "stack",
- "_type": "question",
- "_id": "602672",
- "_score": 1,
- "_source": {
- "title": "Ubuntu RFID Screensaver lock-unlock"
- },
- "sort": [
- 1370143379747
- ]
- }
- ]
- }
- }
- },
- {
- "key": "windows",
- "doc_count": 18119,
- "top_tags_hits": {
- "hits": {
- "total": 18119,
- "max_score": 1,
- "hits": [
- {
- "_index": "stack",
- "_type": "question",
- "_id": "602678",
- "_score": 1,
- "_source": {
- "title": "If I change my computers date / time, what could be affected?"
- },
- "sort": [
- 1370142868283
- ]
- }
- ]
- }
- }
- }
- ]
- }
- }
- --------------------------------------------------
- ==== Field collapse example
- Field collapsing or result grouping is a feature that logically groups a result set into groups and per group returns
- top documents. The ordering of the groups is determined by the relevancy of the first document in a group. In
- Elasticsearch this can be implemented via a bucket aggregator that wraps a `top_hits` aggregator as sub-aggregator.
- In the example below we search across crawled webpages. For each webpage we store the body and the domain the webpage
- belong to. By defining a `terms` aggregator on the `domain` field we group the result set of webpages by domain. The
- `top_docs` aggregator is then defined as sub-aggregator, so that the top matching hits are collected per bucket.
- Also a `max` aggregator is defined which is used by the `terms` aggregator's order feature the return the buckets by
- relevancy order of the most relevant document in a bucket.
- [source,js]
- --------------------------------------------------
- {
- "query": {
- "match": {
- "body": "elections"
- }
- },
- "aggs": {
- "top-sites": {
- "terms": {
- "field": "domain",
- "order": {
- "top_hit": "desc"
- }
- },
- "aggs": {
- "top_tags_hits": {
- "top_hits": {}
- },
- "top_hit" : {
- "max": {
- "script": "_score"
- }
- }
- }
- }
- }
- }
- --------------------------------------------------
- At the moment the `max` (or `min`) aggregator is needed to make sure the buckets from the `terms` aggregator are
- ordered according to the score of the most relevant webpage per domain. The `top_hits` aggregator isn't a metric aggregator
- and therefore can't be used in the `order` option of the `terms` aggregator.
|