- [[query-dsl-distance-feature-query]]
- === Distance Feature Query
- The `distance_feature` query is a specialized query that only works
- on <<date, `date`>>, <<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>>
- fields. Its goal is to boost documents' scores based on proximity
- to some given origin. For example, use this query if you want to
- give more weight to documents with dates closer to a certain date,
- or to documents with locations closer to a certain location.
- The query is called `distance_feature` because it dynamically
- calculates distances between the given origin and documents' field values,
- and uses these distances as features to boost the documents' scores.
- The `distance_feature` query is typically used on its own to find the nearest
- neighbors to a given point, or placed in a `should` clause of a
- <<query-dsl-bool-query,`bool`>> query so that its score is added to the score
- of the `bool` query.
- Compared to using <<query-dsl-function-score-query,`function_score`>> or other
- ways to modify the score, this query has the benefit of being able to
- efficiently skip non-competitive hits when
- <<search-uri-request,`track_total_hits`>> is not set to `true`.
- ==== Syntax of distance_feature query
- The `distance_feature` query has the following syntax:
- [source,js]
- --------------------------------------------------
- "distance_feature": {
-   "field": <field>,
-   "origin": <origin>,
-   "pivot": <pivot>,
-   "boost": <boost>
- }
- --------------------------------------------------
- // NOTCONSOLE
- [horizontal]
- `field`::
- Required parameter. Defines the name of the field on which to calculate
- distances. Must be a field of the type `date`, `date_nanos` or `geo_point`,
- and must be indexed (`"index": true`, which is the default) and have
- <<doc-values, doc values>> (`"doc_values": true`, which is the default).
- `origin`::
- Required parameter. Defines a point of origin used for calculating
- distances. Must be a date for `date` and `date_nanos` fields,
- and a geo-point for `geo_point` fields. Date math (for example `now-1h`) is
- supported for a date origin.
- `pivot`::
- Required parameter. Defines the distance from the origin at which the computed
- score equals half of the `boost` parameter. Must be
- a `number + date unit` ("1h", "10d",...) for `date` and `date_nanos` fields,
- and a `number + geo unit` ("1km", "12m",...) for `geo_point` fields.
- `boost`::
- Optional parameter with a default value of `1`. Defines the factor by which
- to multiply the score. Must be a non-negative float number.
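As a minimal sketch of how the parameters above fit together, the following Python helper assembles a `distance_feature` clause as a plain dictionary. The function name `build_distance_feature` is hypothetical and not part of any Elasticsearch client; it only mirrors the documented rules (three required parameters, an optional non-negative `boost`).

```python
from typing import Any, Dict, Optional


def build_distance_feature(field: str, origin: Any, pivot: str,
                           boost: Optional[float] = None) -> Dict[str, Any]:
    """Return a dict shaped like the documented `distance_feature` syntax.

    Hypothetical helper for illustration only; it checks the constraints
    stated in the parameter list and nothing more.
    """
    if not field:
        raise ValueError("field is required")
    if origin is None:
        raise ValueError("origin is required")
    if not pivot:
        raise ValueError("pivot is required")

    body: Dict[str, Any] = {"field": field, "origin": origin, "pivot": pivot}
    if boost is not None:
        if boost < 0:
            raise ValueError("boost must be non-negative")
        body["boost"] = boost  # omitted entirely when the default applies
    return {"distance_feature": body}


# Date field with a date-math origin, and a geo field with a [lon, lat] origin.
print(build_distance_feature("production_date", "now", "7d"))
print(build_distance_feature("location", [-71.3, 41.15], "1000m", boost=2.0))
```

The resulting dictionary can be serialized to JSON and placed wherever a query clause is expected, for example in the `should` clause of a `bool` query.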
- The `distance_feature` query computes a document's score as follows:
- `score = boost * pivot / (pivot + distance)`
- where `distance` is the absolute difference between the origin and
- a document's field value.
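The formula can be checked with a few lines of Python. This is only a sketch of the arithmetic, not Elasticsearch code: the function name is made up, and `distance` and `pivot` are assumed to be plain numbers in the same unit (days here).

```python
def distance_feature_score(distance: float, pivot: float, boost: float = 1.0) -> float:
    """score = boost * pivot / (pivot + distance); distance and pivot share a unit."""
    if pivot <= 0:
        raise ValueError("pivot must be positive")
    if boost < 0:
        raise ValueError("boost must be non-negative")
    return boost * pivot / (pivot + abs(distance))


PIVOT_DAYS = 7.0  # corresponds to "pivot": "7d"

# At distance 0 the score equals boost; at distance == pivot it equals boost / 2.
for days in (0, 7, 14, 70):
    print(days, distance_feature_score(days, PIVOT_DAYS, boost=2.0))
```

The score decays smoothly and never reaches zero, so distant documents still receive a small positive contribution.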
- ==== Example using distance_feature query
- Let's look at an example. We index several documents containing
- information about sales items, such as name, production date,
- and location.
- [source,js]
- --------------------------------------------------
- PUT items
- {
-   "mappings": {
-     "properties": {
-       "name": {
-         "type": "keyword"
-       },
-       "production_date": {
-         "type": "date"
-       },
-       "location": {
-         "type": "geo_point"
-       }
-     }
-   }
- }
- PUT items/_doc/1
- {
-   "name": "chocolate",
-   "production_date": "2018-02-01",
-   "location": [-71.34, 41.12]
- }
- PUT items/_doc/2
- {
-   "name": "chocolate",
-   "production_date": "2018-01-01",
-   "location": [-71.3, 41.15]
- }
- PUT items/_doc/3
- {
-   "name": "chocolate",
-   "production_date": "2017-12-01",
-   "location": [-71.3, 41.12]
- }
- POST items/_refresh
- --------------------------------------------------
- // CONSOLE
- The following search matches all chocolate items and ranks chocolates
- that were produced more recently (closer to the date `now`)
- higher.
- [source,js]
- --------------------------------------------------
- GET items/_search
- {
-   "query": {
-     "bool": {
-       "must": {
-         "match": {
-           "name": "chocolate"
-         }
-       },
-       "should": {
-         "distance_feature": {
-           "field": "production_date",
-           "pivot": "7d",
-           "origin": "now"
-         }
-       }
-     }
-   }
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[continued]
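To see how the date query above would order the three sample documents, the sketch below applies the score formula by hand. It makes two assumptions that are not in the original example: `now` is replaced by a fixed, hypothetical origin of 2018-02-08 so the result is reproducible, and distances are counted in whole days. It also ignores the score contributed by the `must` clause.

```python
from datetime import date

PIVOT_DAYS = 7              # "pivot": "7d"
ORIGIN = date(2018, 2, 8)   # hypothetical stand-in for "now"

# production_date values of the indexed documents, keyed by document id.
production_dates = {
    "1": date(2018, 2, 1),
    "2": date(2018, 1, 1),
    "3": date(2017, 12, 1),
}


def date_score(produced: date, boost: float = 1.0) -> float:
    """Apply score = boost * pivot / (pivot + distance) with distance in days."""
    distance_days = abs((ORIGIN - produced).days)
    return boost * PIVOT_DAYS / (PIVOT_DAYS + distance_days)


scores = {doc_id: date_score(d) for doc_id, d in production_dates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for doc_id in ranking:
    print(doc_id, scores[doc_id])
```

With this origin, document 1 lies exactly one pivot (7 days) away and receives half of the boost, and the remaining documents follow in order of age.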
- We can also look for all chocolate items, but this time we want chocolates
- that are produced locally (closer to our geo origin)
- to come first in the result list.
- [source,js]
- --------------------------------------------------
- GET items/_search
- {
-   "query": {
-     "bool": {
-       "must": {
-         "match": {
-           "name": "chocolate"
-         }
-       },
-       "should": {
-         "distance_feature": {
-           "field": "location",
-           "pivot": "1000m",
-           "origin": [-71.3, 41.15]
-         }
-       }
-     }
-   }
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[continued]
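The geo variant can be approximated the same way. The sketch below assumes the distance is a great-circle (haversine) distance in meters on a sphere of radius 6,371,000 m; both the radius and the helper names are illustrative assumptions, so exact scores may differ slightly from what Elasticsearch reports. Note that `geo_point` arrays are ordered `[lon, lat]`.

```python
import math

EARTH_RADIUS_M = 6_371_000.0   # assumed mean Earth radius, for illustration
PIVOT_M = 1000.0               # "pivot": "1000m"
ORIGIN = (-71.3, 41.15)        # (lon, lat), same as the query origin

# location values of the indexed documents as (lon, lat), keyed by document id.
locations = {
    "1": (-71.34, 41.12),
    "2": (-71.3, 41.15),
    "3": (-71.3, 41.12),
}


def haversine_m(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in meters between two (lon, lat) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geo_score(point, boost: float = 1.0) -> float:
    """Apply score = boost * pivot / (pivot + distance) with distance in meters."""
    distance = haversine_m(ORIGIN[0], ORIGIN[1], point[0], point[1])
    return boost * PIVOT_M / (PIVOT_M + distance)


scores = {doc_id: geo_score(p) for doc_id, p in locations.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for doc_id in ranking:
    print(doc_id, round(scores[doc_id], 4))
```

Document 2 sits exactly at the origin and therefore receives the full boost; documents 3 and 1 follow because they are roughly 3.3 km and 4.7 km away, both well beyond the 1000 m pivot.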