[[query-dsl-distance-feature-query]]
=== Distance feature query
++++
<titleabbrev>Distance feature</titleabbrev>
++++

The `distance_feature` query is a specialized query that only works
on <<date, `date`>>, <<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>>
fields. Its goal is to boost documents' scores based on proximity
to a given origin. For example, use this query if you want to
give more weight to documents with dates closer to a certain date,
or to documents with locations closer to a certain location.

This query is called the `distance_feature` query because it dynamically
calculates distances between the given origin and documents' field values,
and uses these distances as features to boost the documents' scores.

The `distance_feature` query is typically used on its own to find the nearest
neighbors to a given point, or put in a `should` clause of a
<<query-dsl-bool-query,`bool`>> query so that its score is added to the score
of the query.

Compared to using <<query-dsl-function-score-query,`function_score`>> or other
ways to modify the score, this query has the benefit of being able to
efficiently skip non-competitive hits when
<<search-uri-request,`track_total_hits`>> is not set to `true`.


==== Syntax of distance_feature query

The `distance_feature` query has the following syntax:

[source,js]
--------------------------------------------------
"distance_feature": {
  "field": <field>,
  "origin": <origin>,
  "pivot": <pivot>,
  "boost" : <boost>
}
--------------------------------------------------
// NOTCONSOLE


[horizontal]
`field`::
Required parameter. Defines the name of the field on which to calculate
distances. Must be a field of the type `date`, `date_nanos` or `geo_point`,
and must be indexed (`"index": true`, which is the default) and have
<<doc-values, doc values>> (`"doc_values": true`, which is the default).

`origin`::
Required parameter. Defines a point of origin used for calculating
distances. Must be a date for `date` and `date_nanos` fields,
and a geo-point for `geo_point` fields. Date math (for example `now-1h`) is
supported for a date origin.

`pivot`::
Required parameter. Defines the distance from the origin at which the computed
score equals half of the `boost` parameter. Must be
a `number + date unit` ("1h", "10d",...) for `date` and `date_nanos` fields,
and a `number + geo unit` ("1km", "12m",...) for `geo_point` fields.

`boost`::
Optional parameter with a default value of `1`. Defines the factor by which
to multiply the score. Must be a non-negative float number.


The `distance_feature` query computes a document's score as follows:

`score = boost * pivot / (pivot + distance)`

where `distance` is the absolute difference between the origin and
a document's field value.

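
To make the shape of this formula concrete, the following Python sketch evaluates it outside of Elasticsearch. The function name `distance_feature_score` and the input checks are assumptions made for illustration only; `distance` and `pivot` are plain numbers that must be expressed in the same unit.

```python
def distance_feature_score(distance: float, pivot: float, boost: float = 1.0) -> float:
    """Evaluate score = boost * pivot / (pivot + distance).

    Hypothetical helper for illustration; it is not part of any
    Elasticsearch client. `distance` and `pivot` share one unit.
    """
    # Assumption of this sketch: reject inputs that make the formula meaningless.
    if pivot <= 0:
        raise ValueError("pivot must be positive")
    if distance < 0 or boost < 0:
        raise ValueError("distance and boost must be non-negative")
    return boost * pivot / (pivot + distance)


# With pivot = 7 and boost = 2, the score starts at `boost` for an exact
# match, drops to half of `boost` at the pivot, and keeps decaying after it.
at_origin = distance_feature_score(0, pivot=7, boost=2)
at_pivot = distance_feature_score(7, pivot=7, boost=2)
at_three_pivots = distance_feature_score(21, pivot=7, boost=2)
print(at_origin, at_pivot, at_three_pivots)
```

The score never reaches zero: distant documents still receive a small positive contribution, which is why `pivot` mainly controls how quickly the boost fades.
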

==== Example using distance_feature query

Let's look at an example. We index several documents containing
information about sales items, such as name, production date,
and location.

[source,js]
--------------------------------------------------
PUT items
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "production_date": {
        "type": "date"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

PUT items/_doc/1
{
  "name" : "chocolate",
  "production_date": "2018-02-01",
  "location": [-71.34, 41.12]
}

PUT items/_doc/2
{
  "name" : "chocolate",
  "production_date": "2018-01-01",
  "location": [-71.3, 41.15]
}

PUT items/_doc/3
{
  "name" : "chocolate",
  "production_date": "2017-12-01",
  "location": [-71.3, 41.12]
}

POST items/_refresh
--------------------------------------------------
// CONSOLE


We look for all chocolate items, but we also want chocolates
that were produced recently (closer to the date `now`)
to be ranked higher.


[source,js]
--------------------------------------------------
GET items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "production_date",
          "pivot": "7d",
          "origin": "now"
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

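
As a rough illustration of how the `should` clause above separates the three documents, the following Python sketch applies the score formula to their production dates. Because `now` changes with every request, the sketch substitutes a fixed, hypothetical origin of `2018-02-08`; the helper name `date_score` is also an assumption. Only the contribution of the `distance_feature` clause is computed, not the score of the `match` clause that is added to it.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for `now`; the real query resolves `now` at search time.
origin = datetime(2018, 2, 8)
pivot = timedelta(days=7)  # corresponds to "pivot": "7d"

# Production dates of the three indexed example documents.
production_dates = {
    "1": datetime(2018, 2, 1),
    "2": datetime(2018, 1, 1),
    "3": datetime(2017, 12, 1),
}


def date_score(value: datetime, origin: datetime, pivot: timedelta, boost: float = 1.0) -> float:
    """score = boost * pivot / (pivot + distance), with both terms in seconds."""
    distance = abs((origin - value).total_seconds())
    pivot_seconds = pivot.total_seconds()
    return boost * pivot_seconds / (pivot_seconds + distance)


scores = {doc_id: date_score(d, origin, pivot) for doc_id, d in production_dates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for doc_id in ranking:
    print(doc_id, round(scores[doc_id], 4))
```

Under this assumed origin, document `1` lies exactly one pivot (7 days) away and therefore receives half of the default `boost`, while the older documents receive progressively smaller contributions.
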

We can also look for all chocolate items, but prefer chocolates
that are produced locally (closer to our geo origin), so that they
come first in the result list.


[source,js]
--------------------------------------------------
GET items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "location",
          "pivot": "1000m",
          "origin": [-71.3, 41.15]
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
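
The geo case can be approximated in the same way. The following Python sketch assumes that distances are great-circle (haversine) distances on a spherical Earth with a mean radius of 6,371,008.8 m; the exact distance computation used internally may differ slightly. The helper names `haversine_m` and `geo_score` are hypothetical. Note that `geo_point` arrays are written as `[lon, lat]`.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # assumed mean Earth radius in meters


def haversine_m(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geo_score(point, origin, pivot_m: float, boost: float = 1.0) -> float:
    """score = boost * pivot / (pivot + distance), with both terms in meters."""
    distance = haversine_m(point[0], point[1], origin[0], origin[1])
    return boost * pivot_m / (pivot_m + distance)


origin = (-71.3, 41.15)  # [lon, lat], as in the query above
pivot_m = 1000.0         # corresponds to "pivot": "1000m"

# Locations of the three indexed example documents, as (lon, lat).
locations = {
    "1": (-71.34, 41.12),
    "2": (-71.3, 41.15),
    "3": (-71.3, 41.12),
}

scores = {doc_id: geo_score(p, origin, pivot_m) for doc_id, p in locations.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for doc_id in ranking:
    print(doc_id, round(scores[doc_id], 4))
```

Document `2` sits exactly at the origin, so its distance is zero and it receives the full default `boost` of `1`. Document `3` is roughly 3.3 km away and document `1` is farther still, so both fall well past the 1000 m pivot and receive less than half of the boost.
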