- [[query-dsl-distance-feature-query]]
- === Distance feature query
- ++++
- <titleabbrev>Distance feature</titleabbrev>
- ++++
- Boosts the <<relevance-scores,relevance score>> of documents closer to a
- provided `origin` date or point. For example, you can use this query to give
- more weight to documents closer to a certain date or location.
- You can use the `distance_feature` query to find the nearest neighbors to a
- location. You can also use the query in a <<query-dsl-bool-query,`bool`>>
- search's `should` clause to add boosted relevance scores to the `bool` query's
- scores.
- [[distance-feature-query-ex-request]]
- ==== Example request
- [[distance-feature-index-setup]]
- ===== Index setup
- To use the `distance_feature` query, your index must include a <<date, `date`>>,
- <<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>> field.
- To see how you can set up an index for the `distance_feature` query, try the
- following example.
- . Create an `items` index with the following field mapping:
- +
- --
- * `name`, a <<keyword,`keyword`>> field
- * `production_date`, a <<date, `date`>> field
- * `location`, a <<geo-point,`geo_point`>> field
- [source,console]
- ----
- PUT /items
- {
-   "mappings": {
-     "properties": {
-       "name": {
-         "type": "keyword"
-       },
-       "production_date": {
-         "type": "date"
-       },
-       "location": {
-         "type": "geo_point"
-       }
-     }
-   }
- }
- ----
- // TESTSETUP
- --
- . Index several documents to this index.
- +
- --
- [source,console]
- ----
- PUT /items/_doc/1?refresh
- {
-   "name": "chocolate",
-   "production_date": "2018-02-01",
-   "location": [-71.34, 41.12]
- }
- PUT /items/_doc/2?refresh
- {
-   "name": "chocolate",
-   "production_date": "2018-01-01",
-   "location": [-71.3, 41.15]
- }
- PUT /items/_doc/3?refresh
- {
-   "name": "chocolate",
-   "production_date": "2017-12-01",
-   "location": [-71.3, 41.12]
- }
- ----
- --
- [[distance-feature-query-ex-query]]
- ===== Example queries
- [[distance-feature-query-date-ex]]
- ====== Boost documents based on date
- The following `bool` search returns documents with a `name` value of
- `chocolate`. The search also uses the `distance_feature` query to increase the
- relevance score of documents with a `production_date` value closer to `now`.
- [source,console]
- ----
- GET /items/_search
- {
-   "query": {
-     "bool": {
-       "must": {
-         "match": {
-           "name": "chocolate"
-         }
-       },
-       "should": {
-         "distance_feature": {
-           "field": "production_date",
-           "pivot": "7d",
-           "origin": "now"
-         }
-       }
-     }
-   }
- }
- ----
- [[distance-feature-query-distance-ex]]
- ====== Boost documents based on location
- The following `bool` search returns documents with a `name` value of
- `chocolate`. The search also uses the `distance_feature` query to increase the
- relevance score of documents with a `location` value closer to `[-71.3, 41.15]`.
- [source,console]
- ----
- GET /items/_search
- {
-   "query": {
-     "bool": {
-       "must": {
-         "match": {
-           "name": "chocolate"
-         }
-       },
-       "should": {
-         "distance_feature": {
-           "field": "location",
-           "pivot": "1000m",
-           "origin": [-71.3, 41.15]
-         }
-       }
-     }
-   }
- }
- ----
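- The `distance_feature` query can also run on its own, outside a `bool` query,
- to rank documents purely by proximity. The following request is a minimal
- sketch of a nearest-neighbor style search. It assumes the `items` index and
- sample documents created above, and reuses the same `origin` and `pivot`
- values for illustration only.
- [source,console]
- ----
- GET /items/_search
- {
-   "query": {
-     "distance_feature": {
-       "field": "location",
-       "pivot": "1000m",
-       "origin": [-71.3, 41.15]
-     }
-   }
- }
- ----
- Because results are sorted by relevance score by default, documents whose
- `location` is nearest to the `origin` should appear first.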
- [[distance-feature-top-level-params]]
- ==== Top-level parameters for `distance_feature`
- `field`::
- (Required, string) Name of the field used to calculate distances. This field
- must meet the following criteria:
- * Be a <<date, `date`>>, <<date_nanos, `date_nanos`>> or
- <<geo-point,`geo_point`>> field
- * Have an <<mapping-index,`index`>> mapping parameter value of `true`, which is
- the default
- * Have an <<doc-values,`doc_values`>> mapping parameter value of `true`, which
- is the default
- `origin`::
- +
- --
- (Required, string, array, or object) Date or point of origin used to calculate
- distances.
- If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
- field, the `origin` value must be a <<date-format-pattern,date>>.
- <<date-math,Date Math>>, such as `now-1h`, is supported.
- If the `field` value is a <<geo-point,`geo_point`>> field, the `origin` value
- must be a geopoint.
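- As an illustration of a date `origin`, the following sketch measures distance
- from a fixed date rather than from `now`. It assumes the `items` index created
- in the example above. The date `2018-01-15` and the `7d` pivot are arbitrary
- values chosen for this example.
- [source,console]
- ----
- GET /items/_search
- {
-   "query": {
-     "distance_feature": {
-       "field": "production_date",
-       "pivot": "7d",
-       "origin": "2018-01-15"
-     }
-   }
- }
- ----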
- --
- `pivot`::
- +
- --
- (Required, <<time-units,time unit>> or <<distance-units,distance unit>>)
- Distance from the `origin` at which relevance scores receive half of the `boost`
- value.
- If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
- field, the `pivot` value must be a <<time-units,time unit>>, such as `1h` or
- `10d`.
- If the `field` value is a <<geo-point,`geo_point`>> field, the `pivot` value
- must be a <<distance-units,distance unit>>, such as `1km` or `12m`.
- --
- `boost`::
- +
- --
- (Optional, float) Floating point number used to multiply the
- <<relevance-scores,relevance score>> of matching documents. This value
- cannot be negative. Defaults to `1.0`.
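- As a minimal sketch, the following request reuses the `items` index from the
- example above and sets `boost` to `2.0`, an arbitrary illustrative value. With
- this setting, the `distance_feature` clause can contribute up to `2.0` to a
- document's score instead of up to `1.0`.
- [source,console]
- ----
- GET /items/_search
- {
-   "query": {
-     "bool": {
-       "must": {
-         "match": {
-           "name": "chocolate"
-         }
-       },
-       "should": {
-         "distance_feature": {
-           "field": "production_date",
-           "pivot": "7d",
-           "origin": "now",
-           "boost": 2.0
-         }
-       }
-     }
-   }
- }
- ----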
- --
- [[distance-feature-notes]]
- ==== Notes
- [[distance-feature-calculation]]
- ===== How the `distance_feature` query calculates relevance scores
- The `distance_feature` query dynamically calculates the distance between the
- `origin` value and a document's field values. It then uses this distance as a
- feature to boost the <<relevance-scores,relevance score>> of closer
- documents.
- The `distance_feature` query calculates a document's
- <<relevance-scores,relevance score>> as follows:
- ```
- relevance score = boost * pivot / (pivot + distance)
- ```
- The `distance` is the absolute difference between the `origin` value and a
- document's field value.
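- As a worked example of the formula above, assume the default `boost` of `1.0`
- and a `pivot` of `7d`. The distances below are illustrative and are not taken
- from the sample documents:
- ```
- distance = 0d   ->  1.0 * 7 / (7 + 0)  = 1.0
- distance = 7d   ->  1.0 * 7 / (7 + 7)  = 0.5
- distance = 21d  ->  1.0 * 7 / (7 + 21) = 0.25
- ```
- A document exactly one `pivot` away from the `origin` therefore scores half of
- the `boost` value, and the score approaches zero as the distance grows.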
- [[distance-feature-skip-hits]]
- ===== Skip non-competitive hits
- Unlike the <<query-dsl-function-score-query,`function_score`>> query or other
- ways to change <<relevance-scores,relevance scores>>, the
- `distance_feature` query efficiently skips non-competitive hits when the
- <<search-uri-request,`track_total_hits`>> parameter is **not** `true`.
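- For instance, the following sketch reuses the `items` index from the examples
- above and sets `track_total_hits` to `false` in the request body, so the
- search does not need an exact hit count and can skip non-competitive
- documents. On an index as small as the three sample documents, any speed-up is
- likely negligible.
- [source,console]
- ----
- GET /items/_search
- {
-   "track_total_hits": false,
-   "query": {
-     "distance_feature": {
-       "field": "production_date",
-       "pivot": "7d",
-       "origin": "now"
-     }
-   }
- }
- ----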