[[query-dsl-rank-feature-query]]
=== Rank feature query
++++
<titleabbrev>Rank feature</titleabbrev>
++++

Boosts the <<relevance-scores,relevance score>> of documents based on the
numeric value of a <<rank-feature,`rank_feature`>> or
<<rank-features,`rank_features`>> field.

The `rank_feature` query is typically used in the `should` clause of a
<<query-dsl-bool-query,`bool`>> query so its relevance scores are added to other
scores from the `bool` query.

Unlike the <<query-dsl-function-score-query,`function_score`>> query or other
ways to change <<relevance-scores,relevance scores>>, the
`rank_feature` query efficiently skips non-competitive hits when the
<<search-uri-request,`track_total_hits`>> parameter is **not** `true`. This can
dramatically improve query speed.

[[rank-feature-query-functions]]
==== Rank feature functions

To calculate relevance scores based on rank feature fields, the `rank_feature`
query supports the following mathematical functions:

* <<rank-feature-query-saturation,Saturation>>
* <<rank-feature-query-logarithm,Logarithm>>
* <<rank-feature-query-sigmoid,Sigmoid>>

If you don't know where to start, we recommend using the `saturation` function.
If no function is provided, the `rank_feature` query uses the `saturation`
function by default.

[[rank-feature-query-ex-request]]
==== Example request

[[rank-feature-query-index-setup]]
===== Index setup

To use the `rank_feature` query, your index must include a
<<rank-feature,`rank_feature`>> or <<rank-features,`rank_features`>> field
mapping. To see how you can set up an index for the `rank_feature` query, try
the following example.

Create a `test` index with the following field mappings:

- `pagerank`, a <<rank-feature,`rank_feature`>> field which measures the
importance of a website
- `url_length`, a <<rank-feature,`rank_feature`>> field which contains the
length of the website's URL. For this example, a long URL correlates negatively
with relevance, indicated by a `positive_score_impact` value of `false`.
- `topics`, a <<rank-features,`rank_features`>> field which contains a list of
topics and a measure of how well each document is connected to each topic

[source,console]
----
PUT /test
{
  "mappings": {
    "properties": {
      "pagerank": {
        "type": "rank_feature"
      },
      "url_length": {
        "type": "rank_feature",
        "positive_score_impact": false
      },
      "topics": {
        "type": "rank_features"
      }
    }
  }
}
----
// TESTSETUP

Index several documents to the `test` index.

[source,console]
----
PUT /test/_doc/1?refresh
{
  "url": "https://en.wikipedia.org/wiki/2016_Summer_Olympics",
  "content": "Rio 2016",
  "pagerank": 50.3,
  "url_length": 42,
  "topics": {
    "sports": 50,
    "brazil": 30
  }
}

PUT /test/_doc/2?refresh
{
  "url": "https://en.wikipedia.org/wiki/2016_Brazilian_Grand_Prix",
  "content": "Formula One motor race held on 13 November 2016",
  "pagerank": 50.3,
  "url_length": 47,
  "topics": {
    "sports": 35,
    "formula one": 65,
    "brazil": 20
  }
}

PUT /test/_doc/3?refresh
{
  "url": "https://en.wikipedia.org/wiki/Deadpool_(film)",
  "content": "Deadpool is a 2016 American superhero film",
  "pagerank": 50.3,
  "url_length": 37,
  "topics": {
    "movies": 60,
    "super hero": 65
  }
}
----

[[rank-feature-query-ex-query]]
===== Example query

The following query searches for `2016` and boosts relevance scores based on
`pagerank`, `url_length`, and the `sports` topic.

[source,console]
----
GET /test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "content": "2016"
          }
        }
      ],
      "should": [
        {
          "rank_feature": {
            "field": "pagerank"
          }
        },
        {
          "rank_feature": {
            "field": "url_length",
            "boost": 0.1
          }
        },
        {
          "rank_feature": {
            "field": "topics.sports",
            "boost": 0.4
          }
        }
      ]
    }
  }
}
----

[[rank-feature-top-level-params]]
==== Top-level parameters for `rank_feature`

`field`::
(Required, string) <<rank-feature,`rank_feature`>> or
<<rank-features,`rank_features`>> field used to boost
<<relevance-scores,relevance scores>>.

`boost`::
+
--
(Optional, float) Floating point number used to decrease or increase
<<relevance-scores,relevance scores>>. Defaults to `1.0`.

Boost values are relative to the default value of `1.0`. A boost value between
`0` and `1.0` decreases the relevance score. A value greater than `1.0`
increases the relevance score.
--

`saturation`::
+
--
(Optional, <<rank-feature-query-saturation,function object>>) Saturation
function used to boost <<relevance-scores,relevance scores>> based on the
value of the rank feature `field`. If no function is provided, the `rank_feature`
query defaults to the `saturation` function. See
<<rank-feature-query-saturation,Saturation>> for more information.

Only one of the `saturation`, `log`, or `sigmoid` functions can be provided.
--

`log`::
+
--
(Optional, <<rank-feature-query-logarithm,function object>>) Logarithmic
function used to boost <<relevance-scores,relevance scores>> based on the
value of the rank feature `field`. See
<<rank-feature-query-logarithm,Logarithm>> for more information.

Only one of the `saturation`, `log`, or `sigmoid` functions can be provided.
--

`sigmoid`::
+
--
(Optional, <<rank-feature-query-sigmoid,function object>>) Sigmoid function used
to boost <<relevance-scores,relevance scores>> based on the value of the
rank feature `field`. See <<rank-feature-query-sigmoid,Sigmoid>> for more
information.

Only one of the `saturation`, `log`, or `sigmoid` functions can be provided.
--

[[rank-feature-query-notes]]
==== Notes

[[rank-feature-query-saturation]]
===== Saturation

The `saturation` function gives a score equal to `S / (S + pivot)`, where `S` is
the value of the rank feature field and `pivot` is a configurable pivot value so
that the result will be less than `0.5` if `S` is less than pivot and greater
than `0.5` otherwise. Scores are always in the range `(0,1)`.

If the rank feature has a negative score impact then the function will be
computed as `pivot / (S + pivot)`, which decreases when `S` increases.

[source,console]
--------------------------------------------------
GET /test/_search
{
  "query": {
    "rank_feature": {
      "field": "pagerank",
      "saturation": {
        "pivot": 8
      }
    }
  }
}
--------------------------------------------------

If a `pivot` value is not provided, {es} computes a default value equal to the
approximate geometric mean of all rank feature values in the index. We recommend
using this default value if you haven't had the opportunity to train a good
pivot value.

[source,console]
--------------------------------------------------
GET /test/_search
{
  "query": {
    "rank_feature": {
      "field": "pagerank",
      "saturation": {}
    }
  }
}
--------------------------------------------------

[[rank-feature-query-logarithm]]
===== Logarithm

The `log` function gives a score equal to `log(scaling_factor + S)`, where `S`
is the value of the rank feature field and `scaling_factor` is a configurable
scaling factor. Scores are unbounded.

This function only supports rank features that have a positive score impact.

[source,console]
--------------------------------------------------
GET /test/_search
{
  "query": {
    "rank_feature": {
      "field": "pagerank",
      "log": {
        "scaling_factor": 4
      }
    }
  }
}
--------------------------------------------------

[[rank-feature-query-sigmoid]]
===== Sigmoid

The `sigmoid` function is an extension of `saturation` which adds a configurable
exponent. Scores are computed as `S^exp^ / (S^exp^ + pivot^exp^)`. As with the
`saturation` function, `pivot` is the value of `S` that gives a score of `0.5`
and scores are in the range `(0,1)`.

The `exponent` must be positive and is typically in `[0.5, 1]`. A
good value should be computed via training. If you don't have the opportunity to
do so, we recommend you use the `saturation` function instead.

[source,console]
--------------------------------------------------
GET /test/_search
{
  "query": {
    "rank_feature": {
      "field": "pagerank",
      "sigmoid": {
        "pivot": 7,
        "exponent": 0.6
      }
    }
  }
}
--------------------------------------------------
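
As a rough offline illustration of the three formulas described in the notes,
the following Python sketch evaluates them for the `pagerank` value (`50.3`)
from the example documents, using the `pivot`, `scaling_factor`, and `exponent`
values from the example requests. The function names are hypothetical helpers
written for this sketch, not part of any {es} client, and the sketch assumes
that `log` denotes the natural logarithm. Scores returned by {es} may differ,
for example because of how feature values are stored and because any `boost`
is applied on top.

```python
import math


def saturation(s, pivot, positive_score_impact=True):
    """S / (S + pivot), or pivot / (S + pivot) for a negative score impact."""
    if positive_score_impact:
        return s / (s + pivot)
    return pivot / (s + pivot)


def log_score(s, scaling_factor):
    """log(scaling_factor + S); the natural log is an assumption of this sketch."""
    return math.log(scaling_factor + s)


def sigmoid(s, pivot, exponent):
    """S^exp / (S^exp + pivot^exp)."""
    return s ** exponent / (s ** exponent + pivot ** exponent)


pagerank = 50.3  # value indexed for all three example documents

# Parameter values are taken from the example requests in the notes.
print("saturation:", round(saturation(pagerank, pivot=8), 4))
print("log:       ", round(log_score(pagerank, scaling_factor=4), 4))
print("sigmoid:   ", round(sigmoid(pagerank, pivot=7, exponent=0.6), 4))

# url_length has a negative score impact, so a longer URL scores lower.
print("url 42:    ", round(saturation(42, pivot=8, positive_score_impact=False), 4))
print("url 47:    ", round(saturation(47, pivot=8, positive_score_impact=False), 4))
```

When `S` equals `pivot`, both `saturation` and `sigmoid` return `0.5`, which
matches the description of the pivot in the notes. The `pivot` of `8` used for
`url_length` is chosen only for illustration.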