- [[query-dsl-script-score-query]]
- === Script Score Query
- The `script_score` query allows you to modify the score of documents that are
- retrieved by a query. This can be useful if, for example, a score
- function is computationally expensive and it is sufficient to compute
- the score on a filtered set of documents.
- To use `script_score`, you have to define a query and a script -
- a function to be used to compute a new score for each document returned
- by the query. For more information on scripting see
- <<modules-scripting, scripting documentation>>.
- Here is an example of using `script_score` to assign each matched document
- a score equal to the number of likes divided by 10:
- [source,js]
- --------------------------------------------------
- GET /_search
- {
- "query" : {
- "script_score" : {
- "query" : {
- "match": { "message": "elasticsearch" }
- },
- "script" : {
- "source" : "doc['likes'].value / 10 "
- }
- }
- }
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[setup:twitter]
- NOTE: The values returned from `script_score` cannot be negative. In general,
- Lucene requires the scores produced by queries to be non-negative in order to
- support certain search optimizations.
- ==== Accessing the score of a document within a script
- Within a script, you can
- {ref}/modules-scripting-fields.html#scripting-score[access]
- the `_score` variable which represents the current relevance score of a
- document.
- ==== Predefined functions within a Painless script
- You can use any of the available
- <<painless-api-reference, Painless functions>> in a Painless script.
- Besides these functions, there are a number of predefined functions
- that can help you with scoring. We suggest that you use them instead of
- rewriting equivalent functions of your own, as these functions are
- designed to be efficient by using internal mechanisms.
- ===== saturation
- `saturation(value,k) = value/(k + value)`
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "saturation(doc['likes'].value, 1)"
- }
- --------------------------------------------------
- // NOTCONSOLE
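To get a feel for how `saturation` behaves, the formula above can be reproduced outside Elasticsearch. The following is a minimal Python sketch of the documented formula, not the Painless implementation; the function name and the sample like counts are illustrative assumptions.

```python
def saturation(value: float, k: float) -> float:
    """Mirror of the documented formula: value / (k + value)."""
    return value / (k + value)


# With k = 1, as in the script above, the score equals 0.5 when value == k
# and approaches (but never reaches) 1 as the number of likes grows.
scores = [saturation(likes, 1) for likes in (0, 1, 9, 99)]
print(scores)  # [0.0, 0.5, 0.9, 0.99]
```

In this sketch `k` acts as the pivot: a document whose value equals `k` scores exactly 0.5.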
- ===== sigmoid
- `sigmoid(value, k, a) = value^a/ (k^a + value^a)`
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "sigmoid(doc['likes'].value, 2, 1)"
- }
- --------------------------------------------------
- // NOTCONSOLE
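The effect of the exponent `a` is easier to see with numbers. The following Python sketch mirrors the documented `sigmoid` formula; it is an illustration under the assumption of positive inputs, not the Painless implementation, and the sample values are hypothetical.

```python
def sigmoid(value: float, k: float, a: float) -> float:
    """Mirror of the documented formula: value^a / (k^a + value^a)."""
    return value**a / (k**a + value**a)


# When value == k the result is 0.5 regardless of the exponent.
print(sigmoid(2, 2, 1))  # 0.5
# A larger exponent makes the curve steeper around the pivot k.
print(sigmoid(6, 2, 1))  # 0.75
print(sigmoid(6, 2, 2))  # 0.9
```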
- [role="xpack"]
- [testenv="basic"]
- [[vector-functions]]
- ===== Functions for vector fields
- experimental[]
- These functions are used for
- <<dense-vector,`dense_vector`>> and
- <<sparse-vector,`sparse_vector`>> fields.
- NOTE: During vector functions' calculation, all matched documents are
- linearly scanned. Thus, expect the query time to grow linearly
- with the number of matched documents. For this reason, we recommend
- limiting the number of matched documents with a `query` parameter.
- For `dense_vector` fields, `cosineSimilarity` calculates the
- cosine similarity between a given query vector and document vectors.
- [source,js]
- --------------------------------------------------
- {
- "query": {
- "script_score": {
- "query": {
- "match_all": {}
- },
- "script": {
- "source": "cosineSimilarity(params.query_vector, doc['my_dense_vector']) + 1.0", <1>
- "params": {
- "query_vector": [4, 3.4, -0.2] <2>
- }
- }
- }
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- <1> The script adds 1.0 to the cosine similarity to prevent the score from being negative.
- <2> To take advantage of the script optimizations, provide a query vector as a script parameter.
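Cosine similarity lies between -1 and 1, so adding 1.0 maps it onto the range 0 to 2. The following Python sketch illustrates this with the textbook definition of cosine similarity; it assumes equal-length, non-zero vectors, and the document vectors are hypothetical values chosen only to show both ends of the range.

```python
import math


def cosine_similarity(query_vector, doc_vector):
    """Textbook cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(q * d for q, d in zip(query_vector, doc_vector))
    query_norm = math.sqrt(sum(q * q for q in query_vector))
    doc_norm = math.sqrt(sum(d * d for d in doc_vector))
    return dot / (query_norm * doc_norm)


query_vector = [4, 3.4, -0.2]  # same values as params.query_vector above

# Hypothetical document vectors, chosen only to show the range of scores.
same_direction = [8, 6.8, -0.4]
opposite_direction = [-4, -3.4, 0.2]

# Mirrors the script: cosine similarity shifted by 1.0.
best = cosine_similarity(query_vector, same_direction) + 1.0
worst = cosine_similarity(query_vector, opposite_direction) + 1.0
print(best, worst)  # approximately 2.0 and approximately 0.0
```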
- Similarly, for `sparse_vector` fields, `cosineSimilaritySparse` calculates the cosine similarity
- between a given query vector and document vectors.
- [source,js]
- --------------------------------------------------
- {
- "query": {
- "script_score": {
- "query": {
- "match_all": {}
- },
- "script": {
- "source": "cosineSimilaritySparse(params.query_vector, doc['my_sparse_vector']) + 1.0",
- "params": {
- "query_vector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
- }
- }
- }
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- For `dense_vector` fields, `dotProduct` calculates the
- dot product between a given query vector and document vectors.
- [source,js]
- --------------------------------------------------
- {
- "query": {
- "script_score": {
- "query": {
- "match_all": {}
- },
- "script": {
- "source": """
- double value = dotProduct(params.query_vector, doc['my_vector']);
- return sigmoid(1, Math.E, -value); <1>
- """,
- "params": {
- "query_vector": [4, 3.4, -0.2]
- }
- }
- }
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- <1> Using the standard sigmoid function prevents scores from being negative.
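It may not be obvious why `sigmoid(1, Math.E, -value)` is the standard sigmoid. Substituting into the formula documented earlier gives `1^(-value) / (e^(-value) + 1^(-value))`, which simplifies to `1 / (1 + e^(-value))`. The following Python sketch checks this numerically; the helper names and the document vector are illustrative assumptions, not part of Elasticsearch.

```python
import math


def dot_product(query_vector, doc_vector):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(q * d for q, d in zip(query_vector, doc_vector))


def sigmoid(value, k, a):
    """Mirror of the documented formula: value^a / (k^a + value^a)."""
    return value**a / (k**a + value**a)


def logistic(x):
    """Standard sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


query_vector = [4, 3.4, -0.2]   # same values as params.query_vector above
doc_vector = [-1.0, 0.5, 2.0]   # hypothetical document vector

value = dot_product(query_vector, doc_vector)  # negative for this pair
score = sigmoid(1, math.e, -value)

# The two forms agree, and the result stays strictly between 0 and 1
# even though the dot product itself is negative.
print(value < 0, math.isclose(score, logistic(value)), 0.0 < score < 1.0)
```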
- Similarly, for `sparse_vector` fields, `dotProductSparse` calculates the dot product
- between a given query vector and document vectors.
- [source,js]
- --------------------------------------------------
- {
- "query": {
- "script_score": {
- "query": {
- "match_all": {}
- },
- "script": {
- "source": """
- double value = dotProductSparse(params.query_vector, doc['my_sparse_vector']);
- return sigmoid(1, Math.E, -value);
- """,
- "params": {
- "query_vector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
- }
- }
- }
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- NOTE: If a document doesn't have a value for a vector field on which
- a vector function is executed, 0 is returned as a result
- for this document.
- NOTE: If a document's dense vector field has a number of dimensions
- different from the query's vector, 0 is used for missing dimensions
- in the calculations of vector functions.
- [[random-score-function]]
- ===== Random score function
- The `randomScore` function generates scores that are uniformly distributed
- from 0 up to but not including 1.
- It has the following syntax:
- `randomScore(<seed>, <fieldName>)`.
- It has a required parameter, `seed`, as an integer value,
- and an optional parameter, `fieldName`, as a string value.
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "randomScore(100, '_seq_no')"
- }
- --------------------------------------------------
- // NOTCONSOLE
- If the `fieldName` parameter is omitted, the internal Lucene
- document ids will be used as a source of randomness. This is very efficient,
- but unfortunately not reproducible since documents might be renumbered
- by merges.
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "randomScore(100)"
- }
- --------------------------------------------------
- // NOTCONSOLE
- Note that documents that are within the same shard and have the
- same value for the field will get the same score, so it is usually desirable
- to use a field that has unique values for all documents across a shard.
- A good default choice might be to use the `_seq_no`
- field, whose only drawback is that scores will change if the document is
- updated since update operations also update the value of the `_seq_no` field.
- [[decay-functions-numeric-fields]]
- ===== Decay functions for numeric fields
- You can read more about decay functions
- {ref}/query-dsl-function-score-query.html#function-decay[here].
- * `double decayNumericLinear(double origin, double scale, double offset, double decay, double docValue)`
- * `double decayNumericExp(double origin, double scale, double offset, double decay, double docValue)`
- * `double decayNumericGauss(double origin, double scale, double offset, double decay, double docValue)`
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "decayNumericLinear(params.origin, params.scale, params.offset, params.decay, doc['dval'].value)",
- "params": { <1>
- "origin": 20,
- "scale": 10,
- "decay" : 0.5,
- "offset" : 0
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- <1> Using `params` allows the script to be compiled only once, even if the parameter values change.
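The three decay shapes differ mainly in how quickly the score falls off beyond `scale`. The following Python sketch follows the decay formulas described in the Function Score query documentation; it is an approximation for illustration, not the Elasticsearch implementation, and the snake_case function names are assumptions. In each case a document at distance `scale` from `origin` (after subtracting `offset`) scores exactly `decay`.

```python
import math


def _distance(origin, offset, doc_value):
    """Distance from origin, with values inside the offset treated as 0."""
    return max(0.0, abs(doc_value - origin) - offset)


def decay_numeric_linear(origin, scale, offset, decay, doc_value):
    s = scale / (1.0 - decay)
    return max(0.0, (s - _distance(origin, offset, doc_value)) / s)


def decay_numeric_exp(origin, scale, offset, decay, doc_value):
    lam = math.log(decay) / scale  # negative, since 0 < decay < 1
    return math.exp(lam * _distance(origin, offset, doc_value))


def decay_numeric_gauss(origin, scale, offset, decay, doc_value):
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    d = _distance(origin, offset, doc_value)
    return math.exp(-(d ** 2) / (2.0 * sigma_sq))


# Same parameters as the script above: origin 20, scale 10, offset 0, decay 0.5.
params = dict(origin=20, scale=10, offset=0, decay=0.5)
for doc_value in (20, 30, 50):
    print(
        doc_value,
        round(decay_numeric_linear(doc_value=doc_value, **params), 4),
        round(decay_numeric_exp(doc_value=doc_value, **params), 4),
        round(decay_numeric_gauss(doc_value=doc_value, **params), 4),
    )
```

With these parameters, a document whose value is 20 scores 1.0 under all three functions and a document whose value is 30 scores 0.5. Far from the origin, the linear variant reaches 0 while the other two only approach it.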
- ===== Decay functions for geo fields
- * `double decayGeoLinear(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
- * `double decayGeoExp(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
- * `double decayGeoGauss(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "decayGeoExp(params.origin, params.scale, params.offset, params.decay, doc['location'].value)",
- "params": {
- "origin": "40, -70.12",
- "scale": "200km",
- "offset": "0km",
- "decay" : 0.2
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- ===== Decay functions for date fields
- * `double decayDateLinear(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
- * `double decayDateExp(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
- * `double decayDateGauss(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "decayDateGauss(params.origin, params.scale, params.offset, params.decay, doc['date'].value)",
- "params": {
- "origin": "2008-01-01T01:00:00Z",
- "scale": "1h",
- "offset" : "0",
- "decay" : 0.5
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- NOTE: Decay functions on dates are limited to dates in the default format
- and default time zone. Also, calculations with `now` are not supported.
- ==== Faster alternatives
- Script Score Query calculates the score for every hit (matching document).
- There are faster alternative query types that can efficiently skip
- non-competitive hits:
- * If you want to boost documents on some static fields, use
- <<query-dsl-rank-feature-query, Rank Feature Query>>.
- ==== Transition from Function Score Query
- We are deprecating <<query-dsl-function-score-query, Function Score>>, and
- Script Score Query will be a substitute for it.
- Here we describe how Function Score Query's functions can be
- equivalently implemented in Script Score Query:
- [[script-score]]
- ===== `script_score`
- What you used in `script_score` of the Function Score query, you
- can copy into the Script Score query. No changes here.
- [[weight]]
- ===== `weight`
- The `weight` function can be implemented in the Script Score query through
- the following script:
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "params.weight * _score",
- "params": {
- "weight": 2
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- [[random-score]]
- ===== `random_score`
- Use the `randomScore` function
- as described in <<random-score-function, random score function>>.
- [[field-value-factor]]
- ===== `field_value_factor`
- The `field_value_factor` function can be easily implemented through a script:
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "Math.log10(doc['field'].value * params.factor)",
- "params" : {
- "factor" : 5
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- To check whether a document is missing a value, you can use
- `doc['field'].size() == 0`. For example, this script uses
- the value `1` if a document doesn't have the field `field`:
- [source,js]
- --------------------------------------------------
- "script" : {
- "source" : "Math.log10((doc['field'].size() == 0 ? 1 : doc['field'].value) * params.factor)",
- "params" : {
- "factor" : 5
- }
- }
- --------------------------------------------------
- // NOTCONSOLE
- This table lists how `field_value_factor` modifiers can be implemented
- through a script:
- [cols="<,<",options="header",]
- |=======================================================================
- | Modifier | Implementation in Script Score
- | `none` | -
- | `log` | `Math.log10(doc['f'].value)`
- | `log1p` | `Math.log10(doc['f'].value + 1)`
- | `log2p` | `Math.log10(doc['f'].value + 2)`
- | `ln` | `Math.log(doc['f'].value)`
- | `ln1p` | `Math.log(doc['f'].value + 1)`
- | `ln2p` | `Math.log(doc['f'].value + 2)`
- | `square` | `Math.pow(doc['f'].value, 2)`
- | `sqrt` | `Math.sqrt(doc['f'].value)`
- | `reciprocal` | `1.0 / doc['f'].value`
- |=======================================================================
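To sanity-check a migrated modifier, it can help to evaluate the same expression on a few sample values. The following Python sketch mirrors the table above using Python's `math` module in place of Java's `Math`; the `MODIFIERS` mapping is a hypothetical helper for illustration only.

```python
import math

# Each entry mirrors the script expression listed for that modifier.
MODIFIERS = {
    "none": lambda v: v,
    "log": lambda v: math.log10(v),
    "log1p": lambda v: math.log10(v + 1),
    "log2p": lambda v: math.log10(v + 2),
    "ln": lambda v: math.log(v),
    "ln1p": lambda v: math.log(v + 1),
    "ln2p": lambda v: math.log(v + 2),
    "square": lambda v: math.pow(v, 2),
    "sqrt": lambda v: math.sqrt(v),
    "reciprocal": lambda v: 1.0 / v,
}

print(MODIFIERS["log1p"](9))       # 1.0
print(MODIFIERS["sqrt"](16))       # 4.0
print(MODIFIERS["reciprocal"](4))  # 0.25

# Logarithms of values between 0 and 1 are negative, which conflicts with
# the requirement that script scores be non-negative.
print(MODIFIERS["log"](0.5) < 0)   # True
```

The last line shows why `log` and `ln` need care: for field values between 0 and 1 the expression is negative, which the Script Score query does not allow.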
- [[decay-functions]]
- ===== `decay functions`
- Script Score query has equivalent <<decay-functions-numeric-fields, decay functions>>
- that can be used in a script.