[[request-body-search-collapse]]
==== Field Collapsing

Allows you to collapse search results based on field values.
Collapsing selects only the top sorted document per collapse key.
For instance, the query below retrieves the best tweet for each user and sorts the results by number of likes.

[source,console]
--------------------------------------------------
GET /twitter/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "user" <1>
    },
    "sort": ["likes"], <2>
    "from": 10 <3>
}
--------------------------------------------------
// TEST[setup:twitter]

<1> collapse the result set using the "user" field
<2> sort the top docs by number of likes
<3> define the offset of the first collapsed result

WARNING: The total number of hits in the response indicates the number of matching documents without collapsing.
The total number of distinct groups is unknown.

The field used for collapsing must be a single-valued <<keyword, `keyword`>> or <<number, `numeric`>> field with <<doc-values, `doc_values`>> activated.

NOTE: The collapsing is applied to the top hits only and does not affect aggregations.

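
The selection rule described above can be illustrated with a minimal in-memory sketch in Python. It only mirrors the semantics (first hit per collapse key in sort order, offset applied to the collapsed list, total counting all matches); it is not how Elasticsearch implements collapsing, and the `collapse` helper and the sample tweets are hypothetical.

[source,python]
--------------------------------------------------
def collapse(hits, field, sort_key, offset=0, size=10):
    """Keep the first hit per distinct `field` value, in sort order."""
    ranked = sorted(hits, key=sort_key)
    seen = set()
    collapsed = []
    for hit in ranked:
        key = hit[field]
        if key not in seen:  # the first hit seen for a key is its top sorted hit
            seen.add(key)
            collapsed.append(hit)
    return {
        "total": len(hits),  # counts matching documents, not distinct groups
        "hits": collapsed[offset:offset + size],  # offset applies to collapsed results
    }


# Hypothetical sample data, not taken from the twitter test index.
tweets = [
    {"user": "kimchy", "likes": 5},
    {"user": "kimchy", "likes": 1},
    {"user": "alice", "likes": 3},
    {"user": "bob", "likes": 2},
    {"user": "alice", "likes": 7},
]

# Sort so that the most liked tweet comes first.
result = collapse(tweets, "user", sort_key=lambda h: -h["likes"])
print(result["total"], [(h["user"], h["likes"]) for h in result["hits"]])
--------------------------------------------------

In this sketch the total stays at the number of matching documents even though only one hit per user is returned, which matches the warning above.
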
===== Expand collapse results

It is also possible to expand each collapsed top hit with the `inner_hits` option.

[source,console]
--------------------------------------------------
GET /twitter/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "user", <1>
        "inner_hits": {
            "name": "last_tweets", <2>
            "size": 5, <3>
            "sort": [{ "date": "asc" }] <4>
        },
        "max_concurrent_group_searches": 4 <5>
    },
    "sort": ["likes"]
}
--------------------------------------------------
// TEST[setup:twitter]

<1> collapse the result set using the "user" field
<2> the name used for the inner hit section in the response
<3> the number of `inner_hits` to retrieve per collapse key
<4> how to sort the documents inside each group
<5> the number of concurrent requests allowed to retrieve the `inner_hits` per group

See <<request-body-search-inner-hits, inner hits>> for the complete list of supported options and the format of the response.

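
As a rough illustration of what the expansion returns, the following Python sketch attaches up to `size` sorted documents from the same group to each collapsed hit. It is a simplified in-memory model: the `expand` helper and the sample tweets are hypothetical, and the real response nests the hits more deeply than shown here.

[source,python]
--------------------------------------------------
def expand(hits, collapsed, field, name, size, sort_key):
    """Attach up to `size` hits sharing the collapse key to each collapsed hit."""
    expanded = []
    for top in collapsed:
        group = sorted((h for h in hits if h[field] == top[field]), key=sort_key)
        # Simplified shape; the actual response wraps these in hits.hits.
        expanded.append({**top, "inner_hits": {name: group[:size]}})
    return expanded


# Hypothetical sample data.
tweets = [
    {"user": "kimchy", "likes": 5, "date": "2019-01-03"},
    {"user": "kimchy", "likes": 1, "date": "2019-01-01"},
    {"user": "alice", "likes": 7, "date": "2019-01-02"},
    {"user": "kimchy", "likes": 2, "date": "2019-01-02"},
]
collapsed = [tweets[2], tweets[0]]  # one top hit per user, chosen by hand here

expanded = expand(tweets, collapsed, "user", "last_tweets",
                  size=2, sort_key=lambda h: h["date"])
for hit in expanded:
    print(hit["user"], [h["date"] for h in hit["inner_hits"]["last_tweets"]])
--------------------------------------------------
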
It is also possible to request multiple `inner_hits` for each collapsed hit. This can be useful when you want to get
multiple representations of the collapsed hits.

[source,console]
--------------------------------------------------
GET /twitter/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "user", <1>
        "inner_hits": [
            {
                "name": "most_liked", <2>
                "size": 3,
                "sort": [{ "likes": "desc" }]
            },
            {
                "name": "most_recent", <3>
                "size": 3,
                "sort": [{ "date": "desc" }]
            }
        ]
    },
    "sort": ["likes"]
}
--------------------------------------------------
// TEST[setup:twitter]

<1> collapse the result set using the "user" field
<2> return the three most liked tweets for the user
<3> return the three most recent tweets for the user

Each group is expanded by sending one additional query per `inner_hits`
request for each collapsed hit returned in the response. This can significantly slow down the search
if there are many groups or many `inner_hits` requests.

The `max_concurrent_group_searches` request parameter can be used to control
the maximum number of concurrent searches allowed in this phase.
The default is based on the number of data nodes and the default search thread pool size.

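
To make the cost concrete, the following Python sketch counts the additional searches and caps how many run at once. The thread pool is only an analogy for the concurrency limit, not a description of how Elasticsearch schedules these searches, and `expand_groups` and `fake_search` are hypothetical names.

[source,python]
--------------------------------------------------
from concurrent.futures import ThreadPoolExecutor


def expand_groups(group_keys, inner_hit_names, run_search, max_concurrent_group_searches):
    """Run one search per (collapsed hit, inner_hits request) pair with bounded concurrency."""
    tasks = [(key, name) for key in group_keys for name in inner_hit_names]
    with ThreadPoolExecutor(max_workers=max_concurrent_group_searches) as pool:
        # pool.map preserves task order, so results line up with tasks.
        results = list(pool.map(lambda task: run_search(*task), tasks))
    return dict(zip(tasks, results))


def fake_search(key, name):
    # Hypothetical stand-in for the per-group query.
    return f"{name} hits for {key}"


out = expand_groups(["kimchy", "alice", "bob"],
                    ["most_liked", "most_recent"],
                    fake_search,
                    max_concurrent_group_searches=4)
print(len(out))  # 3 collapsed hits x 2 inner_hits requests = 6 additional searches
--------------------------------------------------

The number of additional searches grows with the product of returned groups and `inner_hits` requests, so a smaller page size or fewer `inner_hits` entries reduces the work in this phase.
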
WARNING: `collapse` cannot be used in conjunction with <<request-body-search-scroll, scroll>>,
<<request-body-search-rescore, rescore>> or <<request-body-search-search-after, search after>>.

===== Second level of collapsing

A second level of collapsing is also supported and is applied to `inner_hits`.
For example, the following request finds the top-scored tweets for
each country and, within each country, the top-scored tweets
for each user.

[source,js]
--------------------------------------------------
GET /twitter/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "country",
        "inner_hits" : {
            "name": "by_location",
            "collapse" : {"field" : "user"},
            "size": 3
        }
    }
}
--------------------------------------------------
// NOTCONSOLE

Response:

[source,js]
--------------------------------------------------
{
    ...
    "hits": [
        {
            "_index": "twitter",
            "_type": "_doc",
            "_id": "9",
            "_score": ...,
            "_source": {...},
            "fields": {"country": ["UK"]},
            "inner_hits":{
                "by_location": {
                    "hits": {
                        ...,
                        "hits": [
                            {
                                ...
                                "fields": {"user" : ["user124"]}
                            },
                            {
                                ...
                                "fields": {"user" : ["user589"]}
                            },
                            {
                                ...
                                "fields": {"user" : ["user001"]}
                            }
                        ]
                    }
                }
            }
        },
        {
            "_index": "twitter",
            "_type": "_doc",
            "_id": "1",
            "_score": ...,
            "_source": {...},
            "fields": {"country": ["Canada"]},
            "inner_hits":{
                "by_location": {
                    "hits": {
                        ...,
                        "hits": [
                            {
                                ...
                                "fields": {"user" : ["user444"]}
                            },
                            {
                                ...
                                "fields": {"user" : ["user1111"]}
                            },
                            {
                                ...
                                "fields": {"user" : ["user999"]}
                            }
                        ]
                    }
                }
            }
        },
        ...
    ]
}
--------------------------------------------------
// NOTCONSOLE

NOTE: Second level of collapsing doesn't allow `inner_hits`.

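
The two-level grouping can be sketched in memory as follows. This Python example is an assumption-laden model of the semantics only: the helper names are hypothetical, the user names are borrowed from the response above, and the scores are invented sample values.

[source,python]
--------------------------------------------------
def collapse_by(hits, field):
    """Keep the first hit per distinct `field` value, preserving order."""
    seen, kept = set(), []
    for hit in hits:
        if hit[field] not in seen:
            seen.add(hit[field])
            kept.append(hit)
    return kept


def two_level(hits, outer, inner, size):
    """Collapse by `outer`, then collapse each group's hits by `inner`."""
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)
    result = []
    for top in collapse_by(ranked, outer):
        group = [h for h in ranked if h[outer] == top[outer]]
        result.append({"top": top, "by_location": collapse_by(group, inner)[:size]})
    return result


# Invented scores for illustration only.
tweets = [
    {"country": "UK", "user": "user124", "score": 0.9},
    {"country": "UK", "user": "user124", "score": 0.8},
    {"country": "UK", "user": "user589", "score": 0.7},
    {"country": "Canada", "user": "user444", "score": 0.6},
]

groups = two_level(tweets, "country", "user", size=3)
for g in groups:
    print(g["top"]["country"], [h["user"] for h in g["by_location"]])
--------------------------------------------------
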