[[query-dsl-script-score-query]]
=== Script score query
++++
<titleabbrev>Script score</titleabbrev>
++++

Uses a <<modules-scripting,script>> to provide a custom score for returned
documents.

The `script_score` query is useful if, for example, a scoring function is expensive and you only need to calculate the score of a filtered set of documents.

[[script-score-query-ex-request]]
==== Example request
The following `script_score` query assigns each returned document a score equal to the `my-int` field value divided by `10`.

[source,console]
--------------------------------------------------
GET /_search
{
  "query": {
    "script_score": {
      "query": {
        "match": { "message": "elasticsearch" }
      },
      "script": {
        "source": "doc['my-int'].value / 10"
      }
    }
  }
}
--------------------------------------------------

[[script-score-top-level-params]]
==== Top-level parameters for `script_score`
`query`::
(Required, query object) Query used to return documents.

`script`::
+
--
(Required, <<modules-scripting-using,script object>>) Script used to compute the score of documents returned by the `query`.

IMPORTANT: Final relevance scores from the `script_score` query cannot be
negative. To support certain search optimizations, Lucene requires
scores be positive or `0`.
--

`min_score`::
(Optional, float) Documents with a score lower
than this floating point number are excluded from the search results.

`boost`::
(Optional, float) The score produced by `script` is multiplied by `boost`
to produce each document's final score. Defaults to `1.0`.
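
As a sketch of how the optional parameters fit alongside the required ones, the
following request repeats the example request and adds `min_score` and `boost`.
The values `1.0` and `2.0` are illustrative assumptions, not recommended
settings.

[source,console]
--------------------------------------------------
GET /_search
{
  "query": {
    "script_score": {
      "query": {
        "match": { "message": "elasticsearch" }
      },
      "script": {
        "source": "doc['my-int'].value / 10"
      },
      "min_score": 1.0,
      "boost": 2.0
    }
  }
}
--------------------------------------------------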

[[script-score-query-notes]]
==== Notes

[[script-score-access-scores]]
===== Use relevance scores in a script

Within a script, you can
{ref}/modules-scripting-fields.html#scripting-score[access]
the `_score` variable, which represents the current relevance score of a
document.
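
For example, a script might scale the relevance score of the child query by a
field value. The following is a minimal sketch that reuses the `my-int` field
from the example request. It assumes `my-int` is never negative, so the final
score stays at or above `0`.

[source,js]
--------------------------------------------------
"script" : {
    "source" : "_score * Math.log(2 + doc['my-int'].value)"
}
--------------------------------------------------
// NOTCONSOLE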

[[script-score-access-term-statistics]]
===== Use term statistics in a script

Within a script, you can
{ref}/modules-scripting-fields.html#scripting-term-statistics[access]
the `_termStats` variable, which provides statistical information about the terms used in the child query of the `script_score` query.

[[script-score-predefined-functions]]
===== Predefined functions
You can use any of the available {painless}/painless-contexts.html[Painless
functions] in your `script`. You can also use the following predefined functions
to customize scoring:

* <<script-score-saturation>>
* <<script-score-sigmoid>>
* <<random-score-function>>
* <<decay-functions-numeric-fields>>
* <<decay-functions-geo-fields>>
* <<decay-functions-date-fields>>
* <<script-score-functions-vector-fields>>

We suggest using these predefined functions instead of writing your own.
These functions take advantage of efficiencies from {es}' internal mechanisms.

[[script-score-saturation]]
====== Saturation
`saturation(value, k) = value / (k + value)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "saturation(doc['my-int'].value, 1)"
}
--------------------------------------------------
// NOTCONSOLE

[[script-score-sigmoid]]
====== Sigmoid
`sigmoid(value, k, a) = value^a / (k^a + value^a)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "sigmoid(doc['my-int'].value, 2, 1)"
}
--------------------------------------------------
// NOTCONSOLE

[[random-score-function]]
====== Random score function
The `randomScore` function generates scores that are uniformly distributed
from 0 up to but not including 1.

The `randomScore` function has the following syntax:
`randomScore(<seed>, <fieldName>)`.
It has a required parameter, `seed`, as an integer value,
and an optional parameter, `fieldName`, as a string value.

[source,js]
--------------------------------------------------
"script" : {
    "source" : "randomScore(100, '_seq_no')"
}
--------------------------------------------------
// NOTCONSOLE

If the `fieldName` parameter is omitted, the internal Lucene
document ids will be used as a source of randomness. This is very efficient,
but unfortunately not reproducible since documents might be renumbered
by merges.

[source,js]
--------------------------------------------------
"script" : {
    "source" : "randomScore(100)"
}
--------------------------------------------------
// NOTCONSOLE

Note that documents that are within the same shard and have the
same value for the field will get the same score, so it is usually desirable
to use a field that has unique values for all documents across a shard.
A good default choice might be to use the `_seq_no`
field, whose only drawback is that scores will change if the document is
updated, since update operations also update the value of the `_seq_no` field.

[[decay-functions-numeric-fields]]
====== Decay functions for numeric fields
You can read more about decay functions
{ref}/query-dsl-function-score-query.html#function-decay[here].

* `double decayNumericLinear(double origin, double scale, double offset, double decay, double docValue)`
* `double decayNumericExp(double origin, double scale, double offset, double decay, double docValue)`
* `double decayNumericGauss(double origin, double scale, double offset, double decay, double docValue)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "decayNumericLinear(params.origin, params.scale, params.offset, params.decay, doc['dval'].value)",
    "params": { <1>
        "origin": 20,
        "scale": 10,
        "decay" : 0.5,
        "offset" : 0
    }
}
--------------------------------------------------
// NOTCONSOLE
<1> Using `params` lets the script be compiled only once, even if the parameter values change.

[[decay-functions-geo-fields]]
====== Decay functions for geo fields

* `double decayGeoLinear(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
* `double decayGeoExp(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
* `double decayGeoGauss(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "decayGeoExp(params.origin, params.scale, params.offset, params.decay, doc['location'].value)",
    "params": {
        "origin": "40, -70.12",
        "scale": "200km",
        "offset": "0km",
        "decay" : 0.2
    }
}
--------------------------------------------------
// NOTCONSOLE

[[decay-functions-date-fields]]
====== Decay functions for date fields

* `double decayDateLinear(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
* `double decayDateExp(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
* `double decayDateGauss(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "decayDateGauss(params.origin, params.scale, params.offset, params.decay, doc['date'].value)",
    "params": {
        "origin": "2008-01-01T01:00:00Z",
        "scale": "1h",
        "offset" : "0",
        "decay" : 0.5
    }
}
--------------------------------------------------
// NOTCONSOLE

NOTE: Decay functions on dates are limited to dates in the default format
and default time zone. Also, calculations with `now` are not supported.

[[script-score-functions-vector-fields]]
====== Functions for vector fields
<<vector-functions, Functions for vector fields>> are accessible through
the `script_score` query.

===== Allow expensive queries
Script score queries will not be executed if <<query-dsl-allow-expensive-queries, `search.allow_expensive_queries`>>
is set to `false`.
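
As an illustration, the setting can be changed through the cluster update
settings API. The sketch below assumes you want to disable expensive queries
for the whole cluster and keep the change across restarts (`persistent`). While
it is in effect, `script_score` queries are not executed.

[source,console]
--------------------------------------------------
PUT /_cluster/settings
{
  "persistent": {
    "search.allow_expensive_queries": false
  }
}
--------------------------------------------------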

[[script-score-faster-alt]]
===== Faster alternatives
The `script_score` query calculates the score for
every matching document, or hit. There are faster alternative query types that
can efficiently skip non-competitive hits:

* If you want to boost documents on some static fields, use the
  <<query-dsl-rank-feature-query, `rank_feature`>> query.
* If you want to boost documents closer to a date or geographic point, use the
  <<query-dsl-distance-feature-query, `distance_feature`>> query.
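
As a sketch of the second alternative, the following request boosts documents
whose date is close to the current time. The `production_date` field is a
hypothetical field assumed to be mapped as `date`, and the `7d` pivot is an
illustrative value.

[source,console]
--------------------------------------------------
GET /_search
{
  "query": {
    "bool": {
      "must": {
        "match": { "message": "elasticsearch" }
      },
      "should": {
        "distance_feature": {
          "field": "production_date",
          "pivot": "7d",
          "origin": "now"
        }
      }
    }
  }
}
--------------------------------------------------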

[[script-score-function-score-transition]]
===== Transition from the function score query
We recommend using the `script_score` query instead of the
<<query-dsl-function-score-query, `function_score`>> query for the simplicity
of the `script_score` query.

You can implement the following functions of the `function_score` query using
the `script_score` query:

* <<script-score>>
* <<weight>>
* <<random-score>>
* <<field-value-factor>>
* <<decay-functions>>

[[script-score]]
====== `script_score`
You can copy what you used in `script_score` of the Function Score query
into the Script Score query without changes.

[[weight]]
====== `weight`
The `weight` function can be implemented in the Script Score query through
the following script:

[source,js]
--------------------------------------------------
"script" : {
    "source" : "params.weight * _score",
    "params": {
        "weight": 2
    }
}
--------------------------------------------------
// NOTCONSOLE

[[random-score]]
====== `random_score`

Use the `randomScore` function
as described in <<random-score-function, random score function>>.

[[field-value-factor]]
====== `field_value_factor`
The `field_value_factor` function can be implemented through a script:

[source,js]
--------------------------------------------------
"script" : {
    "source" : "Math.log10(doc['field'].value * params.factor)",
    "params" : {
        "factor" : 5
    }
}
--------------------------------------------------
// NOTCONSOLE

To check whether a document is missing a value, you can use
`doc['field'].size() == 0`. For example, this script uses
a value of `1` if a document doesn't have the field `field`:

[source,js]
--------------------------------------------------
"script" : {
    "source" : "Math.log10((doc['field'].size() == 0 ? 1 : doc['field'].value) * params.factor)",
    "params" : {
        "factor" : 5
    }
}
--------------------------------------------------
// NOTCONSOLE

This table lists how `field_value_factor` modifiers can be implemented
through a script:

[cols="<,<",options="header",]
|=======================================================================
| Modifier | Implementation in Script Score

| `none` | -
| `log` | `Math.log10(doc['f'].value)`
| `log1p` | `Math.log10(doc['f'].value + 1)`
| `log2p` | `Math.log10(doc['f'].value + 2)`
| `ln` | `Math.log(doc['f'].value)`
| `ln1p` | `Math.log(doc['f'].value + 1)`
| `ln2p` | `Math.log(doc['f'].value + 2)`
| `square` | `Math.pow(doc['f'].value, 2)`
| `sqrt` | `Math.sqrt(doc['f'].value)`
| `reciprocal` | `1.0 / doc['f'].value`
|=======================================================================

[[decay-functions]]
====== `decay` functions
The `script_score` query has equivalent <<decay-functions-numeric-fields, decay
functions>> that can be used in scripts.

include::{es-ref-dir}/vectors/vector-functions.asciidoc[]

[[score-explanation]]
===== Explain request
Using an <<search-explain, explain request>> provides an explanation of how the parts of a score were computed. The `script_score` query can add its own explanation by setting the `explanation` parameter:

[source,console]
--------------------------------------------------
GET /my-index-000001/_explain/0
{
  "query": {
    "script_score": {
      "query": {
        "match": { "message": "elasticsearch" }
      },
      "script": {
        "source": """
          long count = doc['count'].value;
          double normalizedCount = count / 10;
          if (explanation != null) {
            explanation.set('normalized count = count / 10 = ' + count + ' / 10 = ' + normalizedCount);
          }
          return normalizedCount;
        """
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:my_index]

Note that the `explanation` will be null when used in a normal `_search` request, so having a conditional guard is best practice.