123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220 |
- [[request-body-search-collapse]]
- ==== Field Collapsing
- Allows to collapse search results based on field values.
- The collapsing is done by selecting only the top sorted document per collapse key.
- For instance the query below retrieves the best tweet for each user and sorts them by number of likes.
- [source,console]
- --------------------------------------------------
- GET /twitter/_search
- {
- "query": {
- "match": {
- "message": "elasticsearch"
- }
- },
- "collapse" : {
- "field" : "user" <1>
- },
- "sort": ["likes"], <2>
- "from": 10 <3>
- }
- --------------------------------------------------
- // TEST[setup:twitter]
- <1> collapse the result set using the "user" field
- <2> sort the top docs by number of likes
- <3> define the offset of the first collapsed result
- WARNING: The total number of hits in the response indicates the number of matching documents without collapsing.
- The total number of distinct group is unknown.
- The field used for collapsing must be a single valued <<keyword, `keyword`>> or <<number, `numeric`>> field with <<doc-values, `doc_values`>> activated
- NOTE: The collapsing is applied to the top hits only and does not affect aggregations.
- ===== Expand collapse results
- It is also possible to expand each collapsed top hits with the `inner_hits` option.
- [source,console]
- --------------------------------------------------
- GET /twitter/_search
- {
- "query": {
- "match": {
- "message": "elasticsearch"
- }
- },
- "collapse" : {
- "field" : "user", <1>
- "inner_hits": {
- "name": "last_tweets", <2>
- "size": 5, <3>
- "sort": [{ "date": "asc" }] <4>
- },
- "max_concurrent_group_searches": 4 <5>
- },
- "sort": ["likes"]
- }
- --------------------------------------------------
- // TEST[setup:twitter]
- <1> collapse the result set using the "user" field
- <2> the name used for the inner hit section in the response
- <3> the number of inner_hits to retrieve per collapse key
- <4> how to sort the document inside each group
- <5> the number of concurrent requests allowed to retrieve the inner_hits` per group
- See <<request-body-search-inner-hits, inner hits>> for the complete list of supported options and the format of the response.
- It is also possible to request multiple `inner_hits` for each collapsed hit. This can be useful when you want to get
- multiple representations of the collapsed hits.
- [source,console]
- --------------------------------------------------
- GET /twitter/_search
- {
- "query": {
- "match": {
- "message": "elasticsearch"
- }
- },
- "collapse" : {
- "field" : "user", <1>
- "inner_hits": [
- {
- "name": "most_liked", <2>
- "size": 3,
- "sort": ["likes"]
- },
- {
- "name": "most_recent", <3>
- "size": 3,
- "sort": [{ "date": "asc" }]
- }
- ]
- },
- "sort": ["likes"]
- }
- --------------------------------------------------
- // TEST[setup:twitter]
- <1> collapse the result set using the "user" field
- <2> return the three most liked tweets for the user
- <3> return the three most recent tweets for the user
- The expansion of the group is done by sending an additional query for each
- `inner_hit` request for each collapsed hit returned in the response. This can significantly slow things down
- if you have too many groups and/or `inner_hit` requests.
- The `max_concurrent_group_searches` request parameter can be used to control
- the maximum number of concurrent searches allowed in this phase.
- The default is based on the number of data nodes and the default search thread pool size.
- WARNING: `collapse` cannot be used in conjunction with <<request-body-search-scroll, scroll>>,
- <<request-body-search-rescore, rescore>> or <<request-body-search-search-after, search after>>.
- ===== Second level of collapsing
- Second level of collapsing is also supported and is applied to `inner_hits`.
- For example, the following request finds the top scored tweets for
- each country, and within each country finds the top scored tweets
- for each user.
- [source,js]
- --------------------------------------------------
- GET /twitter/_search
- {
- "query": {
- "match": {
- "message": "elasticsearch"
- }
- },
- "collapse" : {
- "field" : "country",
- "inner_hits" : {
- "name": "by_location",
- "collapse" : {"field" : "user"},
- "size": 3
- }
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- Response:
- [source,js]
- --------------------------------------------------
- {
- ...
- "hits": [
- {
- "_index": "twitter",
- "_type": "_doc",
- "_id": "9",
- "_score": ...,
- "_source": {...},
- "fields": {"country": ["UK"]},
- "inner_hits":{
- "by_location": {
- "hits": {
- ...,
- "hits": [
- {
- ...
- "fields": {"user" : ["user124"]}
- },
- {
- ...
- "fields": {"user" : ["user589"]}
- },
- {
- ...
- "fields": {"user" : ["user001"]}
- }
- ]
- }
- }
- }
- },
- {
- "_index": "twitter",
- "_type": "_doc",
- "_id": "1",
- "_score": ..,
- "_source": {...},
- "fields": {"country": ["Canada"]},
- "inner_hits":{
- "by_location": {
- "hits": {
- ...,
- "hits": [
- {
- ...
- "fields": {"user" : ["user444"]}
- },
- {
- ...
- "fields": {"user" : ["user1111"]}
- },
- {
- ...
- "fields": {"user" : ["user999"]}
- }
- ]
- }
- }
- }
- },
- ....
- ]
- }
- --------------------------------------------------
- // NOTCONSOLE
- NOTE: Second level of collapsing doesn't allow `inner_hits`.
|