[[query-dsl-distance-feature-query]]
=== Distance feature query
++++
<titleabbrev>Distance feature</titleabbrev>
++++

Boosts the <<relevance-scores,relevance score>> of documents closer to a
provided `origin` date or point. For example, you can use this query to give
more weight to documents closer to a certain date or location.

You can use the `distance_feature` query to find the nearest neighbors to a
location. You can also use the query in the `should` clause of a
<<query-dsl-bool-query,`bool`>> search to add boosted relevance scores to the
`bool` query's scores.

[[distance-feature-query-ex-request]]
==== Example request

[[distance-feature-index-setup]]
===== Index setup
To use the `distance_feature` query, your index must include a <<date, `date`>>,
<<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>> field.

To see how you can set up an index for the `distance_feature` query, try the
following example.

. Create an `items` index with the following field mapping:
+
--

* `name`, a <<keyword,`keyword`>> field
* `production_date`, a <<date, `date`>> field
* `location`, a <<geo-point,`geo_point`>> field

[source,console]
----
PUT /items
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "production_date": {
        "type": "date"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}
----
// TESTSETUP
--

. Index several documents to this index.
+
--
[source,console]
----
PUT /items/_doc/1?refresh
{
  "name" : "chocolate",
  "production_date": "2018-02-01",
  "location": [-71.34, 41.12]
}

PUT /items/_doc/2?refresh
{
  "name" : "chocolate",
  "production_date": "2018-01-01",
  "location": [-71.3, 41.15]
}

PUT /items/_doc/3?refresh
{
  "name" : "chocolate",
  "production_date": "2017-12-01",
  "location": [-71.3, 41.12]
}
----
--

[[distance-feature-query-ex-query]]
===== Example queries

[[distance-feature-query-date-ex]]
====== Boost documents based on date
The following `bool` search returns documents with a `name` value of
`chocolate`. The search also uses the `distance_feature` query to increase the
relevance score of documents with a `production_date` value closer to `now`.

[source,console]
----
GET /items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "production_date",
          "pivot": "7d",
          "origin": "now"
        }
      }
    }
  }
}
----
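
As a rough illustration of the resulting order, the following Python sketch
scores the three sample documents with the formula given in the notes on this
page. It rests on assumptions that are not part of the request above: `now` is
pinned to a hypothetical fixed date (`2018-02-08`) so the result is
reproducible, distances are counted in whole days, every document receives the
same hypothetical `must` score of `0.5`, and the `bool` query is taken to sum
the scores of its matching `must` and `should` clauses. The names
`date_feature_score` and `rank` are illustrative only.

```python
from datetime import date

PIVOT_DAYS = 7              # mirrors "pivot": "7d" in the request
BOOST = 1.0                 # default `boost` value
ORIGIN = date(2018, 2, 8)   # assumption: fixed stand-in for "now"
MUST_SCORE = 0.5            # assumption: hypothetical score of the `match` clause

# production_date values of the three sample documents
docs = {
    "1": date(2018, 2, 1),
    "2": date(2018, 1, 1),
    "3": date(2017, 12, 1),
}


def date_feature_score(field_value, origin=ORIGIN, pivot=PIVOT_DAYS, boost=BOOST):
    """boost * pivot / (pivot + distance), with distance in whole days."""
    distance = abs((origin - field_value).days)
    return boost * pivot / (pivot + distance)


def rank(documents):
    """Sort documents by the assumed bool score: must score + should score."""
    scored = {
        doc_id: MUST_SCORE + date_feature_score(value)
        for doc_id, value in documents.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for doc_id, score in rank(docs):
        print(doc_id, round(score, 4))
```

Under these assumptions, document `1` is exactly one pivot (seven days) from
the origin, so its `distance_feature` contribution is half of the `boost`
value, and the older documents rank below it.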

[[distance-feature-query-distance-ex]]
====== Boost documents based on location
The following `bool` search returns documents with a `name` value of
`chocolate`. The search also uses the `distance_feature` query to increase the
relevance score of documents with a `location` value closer to `[-71.3, 41.15]`.

[source,console]
----
GET /items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "location",
          "pivot": "1000m",
          "origin": [-71.3, 41.15]
        }
      }
    }
  }
}
----
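
To get a feel for the numbers involved, the following Python sketch estimates
the `distance_feature` contribution for each sample document. It is only an
approximation under stated assumptions: distances are computed with the
haversine formula on a sphere with a mean radius of 6,371,008.8 meters, which
may differ slightly from the distance {es} computes internally, and the helper
names `haversine_m` and `geo_feature_score` are hypothetical. Points are given
as `[lon, lat]`, matching the array form used in the sample documents.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumption: mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geo_feature_score(point, origin, pivot_m=1000.0, boost=1.0):
    """boost * pivot / (pivot + distance), with distance in meters."""
    distance = haversine_m(point[0], point[1], origin[0], origin[1])
    return boost * pivot_m / (pivot_m + distance)


ORIGIN = (-71.3, 41.15)          # "origin" from the request, as (lon, lat)
docs = {                         # "location" values of the sample documents
    "1": (-71.34, 41.12),
    "2": (-71.3, 41.15),
    "3": (-71.3, 41.12),
}

scores = {doc_id: geo_feature_score(p, ORIGIN) for doc_id, p in docs.items()}

if __name__ == "__main__":
    for doc_id in sorted(scores, key=scores.get, reverse=True):
        print(doc_id, round(scores[doc_id], 4))
```

In this sketch, document `2` sits exactly at the origin and receives the full
`boost` value. The other two documents are several pivots away, so their
contributions are much smaller.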

[[distance-feature-top-level-params]]
==== Top-level parameters for `distance_feature`
`field`::
(Required, string) Name of the field used to calculate distances. This field
must meet the following criteria:

* Be a <<date, `date`>>, <<date_nanos, `date_nanos`>> or
<<geo-point,`geo_point`>> field
* Have an <<mapping-index,`index`>> mapping parameter value of `true`, which is
the default
* Have an <<doc-values,`doc_values`>> mapping parameter value of `true`, which
is the default

`origin`::
+
--
(Required, string or <<geo-point,geopoint>>) Date or point of origin used to
calculate distances.

If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
field, the `origin` value must be a <<date-format-pattern,date>>.
<<date-math,Date Math>>, such as `now-1h`, is supported.

If the `field` value is a <<geo-point,`geo_point`>> field, the `origin` value
must be a geopoint, such as `[-71.3, 41.15]`.
--

`pivot`::
+
--
(Required, <<time-units,time unit>> or <<distance-units,distance unit>>)
Distance from the `origin` at which relevance scores receive half of the `boost`
value.

If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
field, the `pivot` value must be a <<time-units,time unit>>, such as `1h` or
`10d`.

If the `field` value is a <<geo-point,`geo_point`>> field, the `pivot` value
must be a <<distance-units,distance unit>>, such as `1km` or `12m`.
--

`boost`::
+
--
(Optional, float) Floating point number used to multiply the
<<relevance-scores,relevance score>> of matching documents. This value
cannot be negative. Defaults to `1.0`.
--

[[distance-feature-notes]]
==== Notes

[[distance-feature-calculation]]
===== How the `distance_feature` query calculates relevance scores
The `distance_feature` query dynamically calculates the distance between the
`origin` value and a document's field values. It then uses this distance as a
feature to boost the <<relevance-scores,relevance score>> of closer
documents.

The `distance_feature` query calculates a document's
<<relevance-scores,relevance score>> as follows:

```
relevance score = boost * pivot / (pivot + distance)
```

The `distance` is the absolute difference between the `origin` value and a
document's field value.
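
The shape of this formula can be checked with a few lines of Python. The
sketch below is a direct transcription of the expression above, not {es}
code. The function name `distance_feature_score` and the input validation are
assumptions made for the example, and `pivot` and `distance` only need to
share a unit.

```python
def distance_feature_score(distance, pivot, boost=1.0):
    """Transcription of: relevance score = boost * pivot / (pivot + distance)."""
    if pivot <= 0:
        raise ValueError("pivot must be positive")      # assumption: guard for the sketch
    if distance < 0 or boost < 0:
        raise ValueError("distance and boost cannot be negative")
    return boost * pivot / (pivot + distance)


if __name__ == "__main__":
    pivot = 7  # for example, days or meters
    for distance in (0, pivot, 3 * pivot):
        print(distance, distance_feature_score(distance, pivot))
```

A document at the origin receives the full `boost` value, a document exactly
one `pivot` away receives half of it, and a document three pivots away
receives a quarter. The score decays smoothly and never reaches zero.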

[[distance-feature-skip-hits]]
===== Skip non-competitive hits
Unlike the <<query-dsl-function-score-query,`function_score`>> query or other
ways to change <<relevance-scores,relevance scores>>, the
`distance_feature` query efficiently skips non-competitive hits when the
<<search-uri-request,`track_total_hits`>> parameter is **not** `true`.