
[[query-dsl-distance-feature-query]]
=== Distance Feature Query

The `distance_feature` query is a specialized query that only works
on <<date, `date`>>, <<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>>
fields. Its goal is to boost documents' scores based on proximity
to some given origin. For example, use this query if you want to
give more weight to documents with dates closer to a certain date,
or to documents with locations closer to a certain location.

This query is called the `distance_feature` query because it dynamically
calculates distances between the given origin and documents' field values,
and uses these distances as features to boost the documents' scores.

The `distance_feature` query is typically used on its own to find the nearest
neighbors to a given point, or placed in a `should` clause of a
<<query-dsl-bool-query,`bool`>> query so that its score is added to the score
of the enclosing query.

Compared to using <<query-dsl-function-score-query,`function_score`>> or other
ways to modify the score, this query has the benefit of being able to
efficiently skip non-competitive hits when
<<search-uri-request,`track_total_hits`>> is not set to `true`.
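
As a rough illustration of the standalone usage described above, the following Python sketch only builds a search request body and prints it; it does not send anything to a cluster. The field name `location`, the origin, and the pivot are borrowed from the example later on this page, and setting `track_total_hits` to `false` in the request body is shown as one way to leave it "not set to `true`".

```python
import json

# Hypothetical standalone nearest-neighbor request body.
# Field name, origin and pivot are illustrative values, not requirements.
body = {
    # Leaving exact hit counting off lets the engine skip non-competitive hits.
    "track_total_hits": False,
    "query": {
        "distance_feature": {
            "field": "location",
            "pivot": "1000m",
            "origin": [-71.3, 41.15],
        }
    },
}

# Print the JSON that would be sent as the body of a search request.
print(json.dumps(body, indent=2))
```

The printed JSON could be pasted as the body of a `GET items/_search` request once an index with a `geo_point` field named `location` exists.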

==== Syntax of distance_feature query

`distance_feature` query has the following syntax:

[source,js]
--------------------------------------------------
"distance_feature": {
    "field": <field>,
    "origin": <origin>,
    "pivot": <pivot>,
    "boost" : <boost>
}
--------------------------------------------------
// NOTCONSOLE

[horizontal]
`field`::
Required parameter. Defines the name of the field on which to calculate
distances. Must be a field of the type `date`, `date_nanos` or `geo_point`,
and must be indexed (`"index": true`, which is the default) and have
<<doc-values, doc values>> (`"doc_values": true`, which is the default).

`origin`::
Required parameter. Defines a point of origin used for calculating
distances. Must be a date for `date` and `date_nanos` fields,
and a geo-point for `geo_point` fields. Date math (for example `now-1h`) is
supported for a date origin.

`pivot`::
Required parameter. Defines the distance from the origin at which the computed
score equals half of the `boost` parameter. Must be
a `number + date unit` ("1h", "10d",...) for `date` and `date_nanos` fields,
and a `number + geo unit` ("1km", "12m",...) for `geo_point` fields.

`boost`::
Optional parameter with a default value of `1`. Defines the factor by which
to multiply the score. Must be a non-negative float number.

The `distance_feature` query computes a document's score as follows:

`score = boost * pivot / (pivot + distance)`

where `distance` is the absolute difference between the origin and
a document's field value.
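
To make the formula concrete, here is a minimal Python sketch that evaluates it for a few distances. The function name `distance_feature_score` and the input checks are assumptions of this sketch, not part of the query's API, and the engine's own floating-point arithmetic may give slightly different values.

```python
def distance_feature_score(distance, pivot, boost=1.0):
    """Evaluate boost * pivot / (pivot + distance).

    `distance` and `pivot` must be expressed in the same unit
    (for example days, milliseconds or meters).
    """
    # Guards chosen for this sketch: a zero pivot would make the formula degenerate.
    if pivot <= 0:
        raise ValueError("pivot must be positive")
    if distance < 0 or boost < 0:
        raise ValueError("distance and boost must be non-negative")
    return boost * pivot / (pivot + distance)


pivot = 7.0  # hypothetical pivot, e.g. 7 days
for distance in (0.0, 7.0, 21.0):
    # At distance 0 the score equals boost; at distance == pivot it is boost / 2.
    print(distance, distance_feature_score(distance, pivot, boost=2.0))
```

The score decays smoothly with distance and never reaches zero, so far-away documents still receive a small, non-negative contribution.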

==== Example using distance_feature query

Let's look at an example. We index several documents containing
information about sales items, such as name, production date,
and location.

[source,js]
--------------------------------------------------
PUT items
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "production_date": {
        "type": "date"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

PUT items/_doc/1
{
  "name" : "chocolate",
  "production_date": "2018-02-01",
  "location": [-71.34, 41.12]
}

PUT items/_doc/2
{
  "name" : "chocolate",
  "production_date": "2018-01-01",
  "location": [-71.3, 41.15]
}

PUT items/_doc/3
{
  "name" : "chocolate",
  "production_date": "2017-12-01",
  "location": [-71.3, 41.12]
}

POST items/_refresh
--------------------------------------------------
// CONSOLE

We look for all chocolate items, but we also want chocolates
that are produced recently (closer to the date `now`)
to be ranked higher.

[source,js]
--------------------------------------------------
GET items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "production_date",
          "pivot": "7d",
          "origin": "now"
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
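
To see how the date-based boost would rank the three indexed items, the following Python sketch applies the score formula to their `production_date` values. Because `now` changes over time, the sketch assumes a fixed, hypothetical origin of 2018-02-08; the helper name `date_distance_feature_score` is also an assumption. It covers only the `distance_feature` contribution, not the `match` clause's score.

```python
from datetime import datetime, timedelta


def date_distance_feature_score(value, origin, pivot, boost=1.0):
    """boost * pivot / (pivot + |origin - value|) for date values."""
    distance = abs(origin - value)  # a non-negative timedelta
    # Dividing two timedeltas yields a plain float ratio.
    return boost * (pivot / (pivot + distance))


# Production dates of the three example documents.
production_dates = {
    "1": datetime(2018, 2, 1),
    "2": datetime(2018, 1, 1),
    "3": datetime(2017, 12, 1),
}

origin = datetime(2018, 2, 8)  # hypothetical stand-in for `now`
pivot = timedelta(days=7)      # same pivot as in the query above

scores = {
    doc_id: date_distance_feature_score(date, origin, pivot)
    for doc_id, date in production_dates.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)

for doc_id in ranking:
    print(doc_id, round(scores[doc_id], 4))
```

With this assumed origin, document 1 lies exactly one pivot (7 days) away and so receives half of the default boost, while the older documents 2 and 3 receive progressively smaller contributions.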

We can also look for all chocolate items and have chocolates
that are produced locally (closer to our geo origin)
come first in the result list.

[source,js]
--------------------------------------------------
GET items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "location",
          "pivot": "1000m",
          "origin": [-71.3, 41.15]
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
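
The geo case can be sketched the same way. The Python snippet below assumes that distances are great-circle (haversine) distances on a spherical Earth with a mean radius of 6,371,008.8 m; the engine's own distance computation may differ slightly, so treat the numbers as approximate. The helper names are hypothetical, and coordinates follow the `[lon, lat]` array order used in the documents above.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geo_distance_feature_score(point, origin, pivot_m, boost=1.0):
    """boost * pivot / (pivot + distance), with distance and pivot in meters."""
    distance = haversine_m(point[0], point[1], origin[0], origin[1])
    return boost * pivot_m / (pivot_m + distance)


# Locations of the three example documents, as [lon, lat].
locations = {
    "1": [-71.34, 41.12],
    "2": [-71.3, 41.15],
    "3": [-71.3, 41.12],
}

origin = [-71.3, 41.15]  # same origin as in the query above
pivot_m = 1000.0         # "1000m"

scores = {
    doc_id: geo_distance_feature_score(loc, origin, pivot_m)
    for doc_id, loc in locations.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)

for doc_id in ranking:
    print(doc_id, round(scores[doc_id], 4))
```

Document 2 sits exactly at the origin, so it receives the full boost and ranks first; document 3 is roughly 3.3 km away and document 1 is farther still, so they follow in that order.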