[[query-dsl-script-score-query]]
=== Script Score Query

The `script_score` query allows you to modify the score of documents that are
retrieved by a query. This can be useful if, for example, a score
function is computationally expensive and it is sufficient to compute
the score on a filtered set of documents.

To use `script_score`, you have to define a query and a script, that is,
a function used to compute a new score for each document returned
by the query. For more information on scripting, see the
<<modules-scripting, scripting documentation>>.

Here is an example of using `script_score` to assign each matched document
a score equal to the number of likes divided by 10:

[source,js]
--------------------------------------------------
GET /_search
{
    "query" : {
        "script_score" : {
            "query" : {
                "match": { "message": "elasticsearch" }
            },
            "script" : {
                "source" : "doc['likes'].value / 10"
            }
        }
    }
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

NOTE: The values returned from `script_score` cannot be negative. In general,
Lucene requires the scores produced by queries to be non-negative in order to
support certain search optimizations.

==== Accessing the score of a document within a script

Within a script, you can
{ref}/modules-scripting-fields.html#scripting-score[access]
the `_score` variable, which represents the current relevance score of a
document.

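
As a minimal sketch, assuming the `message` and `likes` fields from the
example above and that `likes` is never negative, a script along these lines
multiplies the relevance score of the `match` query by a logarithmic function
of the number of likes:

[source,js]
--------------------------------------------------
GET /_search
{
    "query" : {
        "script_score" : {
            "query" : {
                "match": { "message": "elasticsearch" }
            },
            "script" : {
                "source" : "_score * Math.log(2 + doc['likes'].value)"
            }
        }
    }
}
--------------------------------------------------
// NOTCONSOLE

Adding 2 before taking the logarithm keeps the factor positive for a document
with zero likes, so the resulting score stays non-negative.
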
==== Predefined functions within a Painless script

You can use any of the available
<<painless-api-reference, painless functions>> in the painless script.
Besides these functions, there are a number of predefined functions
that can help you with scoring. We suggest that you use them instead of
rewriting equivalent functions of your own, as these functions are designed
to be as efficient as possible by using the internal mechanisms.

===== saturation

`saturation(value, k) = value / (k + value)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "saturation(doc['likes'].value, 1)"
}
--------------------------------------------------
// NOTCONSOLE

===== sigmoid

`sigmoid(value, k, a) = value^a / (k^a + value^a)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "sigmoid(doc['likes'].value, 2, 1)"
}
--------------------------------------------------
// NOTCONSOLE

[role="xpack"]
[testenv="basic"]
[[vector-functions]]
===== Functions for vector fields

experimental[]

These functions are used
for <<dense-vector,`dense_vector`>> and
<<sparse-vector,`sparse_vector`>> fields.

NOTE: During vector functions' calculation, all matched documents are
linearly scanned. Thus, expect the query time to grow linearly
with the number of matched documents. For this reason, we recommend
limiting the number of matched documents with a `query` parameter.

For `dense_vector` fields, `cosineSimilarity` calculates the measure of
cosine similarity between a given query vector and document vectors.

[source,js]
--------------------------------------------------
{
    "query": {
        "script_score": {
            "query": {
                "match_all": {}
            },
            "script": {
                "source": "cosineSimilarity(params.query_vector, doc['my_dense_vector']) + 1.0", <1>
                "params": {
                    "query_vector": [4, 3.4, -0.2] <2>
                }
            }
        }
    }
}
--------------------------------------------------
// NOTCONSOLE
<1> The script adds 1.0 to the cosine similarity to prevent the score from being negative.
<2> To take advantage of the script optimizations, provide a query vector as a script parameter.

Similarly, for `sparse_vector` fields, `cosineSimilaritySparse` calculates cosine similarity
between a given query vector and document vectors.

[source,js]
--------------------------------------------------
{
    "query": {
        "script_score": {
            "query": {
                "match_all": {}
            },
            "script": {
                "source": "cosineSimilaritySparse(params.query_vector, doc['my_sparse_vector']) + 1.0",
                "params": {
                    "query_vector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
                }
            }
        }
    }
}
--------------------------------------------------
// NOTCONSOLE

For `dense_vector` fields, `dotProduct` calculates the measure of
dot product between a given query vector and document vectors.

[source,js]
--------------------------------------------------
{
    "query": {
        "script_score": {
            "query": {
                "match_all": {}
            },
            "script": {
                "source": """
                    double value = dotProduct(params.query_vector, doc['my_vector']);
                    return sigmoid(1, Math.E, -value); <1>
                """,
                "params": {
                    "query_vector": [4, 3.4, -0.2]
                }
            }
        }
    }
}
--------------------------------------------------
// NOTCONSOLE
<1> Using the standard sigmoid function prevents scores from being negative.

Similarly, for `sparse_vector` fields, `dotProductSparse` calculates dot product
between a given query vector and document vectors.

[source,js]
--------------------------------------------------
{
    "query": {
        "script_score": {
            "query": {
                "match_all": {}
            },
            "script": {
                "source": """
                    double value = dotProductSparse(params.query_vector, doc['my_sparse_vector']);
                    return sigmoid(1, Math.E, -value);
                """,
                "params": {
                    "query_vector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
                }
            }
        }
    }
}
--------------------------------------------------
// NOTCONSOLE

NOTE: If a document doesn't have a value for a vector field on which
a vector function is executed, an error will be thrown.

You can check whether a document is missing a value for the field `my_vector` with
`doc['my_vector'].size() == 0`. Your overall script can look like this:

[source,js]
--------------------------------------------------
"source": "doc['my_vector'].size() == 0 ? 0 : cosineSimilarity(params.query_vector, doc['my_vector'])"
--------------------------------------------------
// NOTCONSOLE

NOTE: If a document's dense vector field has a number of dimensions
different from the query's vector, an error will be thrown.

[[random-score-function]]
===== Random score function

The `randomScore` function generates scores that are uniformly distributed
from 0 up to but not including 1.

The `randomScore` function has the following syntax:
`randomScore(<seed>, <fieldName>)`.
It has a required parameter, `seed`, as an integer value,
and an optional parameter, `fieldName`, as a string value.

[source,js]
--------------------------------------------------
"script" : {
    "source" : "randomScore(100, '_seq_no')"
}
--------------------------------------------------
// NOTCONSOLE

If the `fieldName` parameter is omitted, the internal Lucene
document ids will be used as a source of randomness. This is very efficient,
but unfortunately not reproducible since documents might be renumbered
by merges.

[source,js]
--------------------------------------------------
"script" : {
    "source" : "randomScore(100)"
}
--------------------------------------------------
// NOTCONSOLE

Note that documents that are within the same shard and have the
same value for the field will get the same score, so it is usually desirable
to use a field that has unique values for all documents across a shard.
A good default choice might be to use the `_seq_no`
field, whose only drawback is that scores will change if the document is
updated, since update operations also update the value of the `_seq_no` field.

[[decay-functions-numeric-fields]]
===== Decay functions for numeric fields

You can read more about decay functions in the
{ref}/query-dsl-function-score-query.html#function-decay[function score query documentation].

* `double decayNumericLinear(double origin, double scale, double offset, double decay, double docValue)`
* `double decayNumericExp(double origin, double scale, double offset, double decay, double docValue)`
* `double decayNumericGauss(double origin, double scale, double offset, double decay, double docValue)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "decayNumericLinear(params.origin, params.scale, params.offset, params.decay, doc['dval'].value)",
    "params": { <1>
        "origin": 20,
        "scale": 10,
        "decay" : 0.5,
        "offset" : 0
    }
}
--------------------------------------------------
// NOTCONSOLE
<1> Using `params` allows the script to be compiled only once, even if the parameter values change.

===== Decay functions for geo fields

* `double decayGeoLinear(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
* `double decayGeoExp(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
* `double decayGeoGauss(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "decayGeoExp(params.origin, params.scale, params.offset, params.decay, doc['location'].value)",
    "params": {
        "origin": "40, -70.12",
        "scale": "200km",
        "offset": "0km",
        "decay" : 0.2
    }
}
--------------------------------------------------
// NOTCONSOLE

===== Decay functions for date fields

* `double decayDateLinear(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
* `double decayDateExp(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
* `double decayDateGauss(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`

[source,js]
--------------------------------------------------
"script" : {
    "source" : "decayDateGauss(params.origin, params.scale, params.offset, params.decay, doc['date'].value)",
    "params": {
        "origin": "2008-01-01T01:00:00Z",
        "scale": "1h",
        "offset" : "0",
        "decay" : 0.5
    }
}
--------------------------------------------------
// NOTCONSOLE

NOTE: Decay functions on dates are limited to dates in the default format
and default time zone. Also, calculations with `now` are not supported.

==== Faster alternatives

Script Score Query calculates the score for every hit (matching document).
There are faster alternative query types that can efficiently skip
non-competitive hits:

* If you want to boost documents on some static fields, use
 <<query-dsl-rank-feature-query, Rank Feature Query>>.

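
For instance, assuming a hypothetical `pagerank` field that is mapped with the
`rank_feature` type, a query along these lines boosts documents by that field
without running a script on every hit. The field name and the `pivot` value
are illustrative assumptions:

[source,js]
--------------------------------------------------
GET /_search
{
    "query" : {
        "rank_feature" : {
            "field" : "pagerank",
            "saturation" : {
                "pivot" : 8
            }
        }
    }
}
--------------------------------------------------
// NOTCONSOLE
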
==== Transition from Function Score Query

We are deprecating <<query-dsl-function-score-query, Function Score>>, and
Script Score Query will be a substitute for it.

Here we describe how Function Score Query's functions can be
equivalently implemented in Script Score Query:

[[script-score]]
===== `script_score`

What you used in `script_score` of the Function Score query, you
can copy into the Script Score query. No changes here.

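
As a sketch of this transition, assuming the `message` and `likes` fields used
earlier on this page, a Function Score query such as the following:

[source,js]
--------------------------------------------------
"function_score" : {
    "query" : {
        "match" : { "message" : "elasticsearch" }
    },
    "script_score" : {
        "script" : {
            "source" : "Math.log(2 + doc['likes'].value)"
        }
    },
    "boost_mode" : "replace"
}
--------------------------------------------------
// NOTCONSOLE

could be rewritten as a Script Score query with the same script:

[source,js]
--------------------------------------------------
"script_score" : {
    "query" : {
        "match" : { "message" : "elasticsearch" }
    },
    "script" : {
        "source" : "Math.log(2 + doc['likes'].value)"
    }
}
--------------------------------------------------
// NOTCONSOLE

The example sets `boost_mode` to `replace` because the Script Score query uses
the script result as the final score. If the Function Score query relied on
the default `multiply` mode, the script would likely need to multiply by
`_score` itself to produce the same ranking.
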
[[weight]]
===== `weight`

The `weight` function can be implemented in the Script Score query through
the following script:

[source,js]
--------------------------------------------------
"script" : {
    "source" : "params.weight * _score",
    "params": {
        "weight": 2
    }
}
--------------------------------------------------
// NOTCONSOLE

[[random-score]]
===== `random_score`

Use the `randomScore` function
as described in <<random-score-function, random score function>>.

[[field-value-factor]]
===== `field_value_factor`

The `field_value_factor` function can be easily implemented through a script:

[source,js]
--------------------------------------------------
"script" : {
    "source" : "Math.log10(doc['field'].value * params.factor)",
    "params" : {
        "factor" : 5
    }
}
--------------------------------------------------
// NOTCONSOLE

To check whether a document is missing a value, you can use
`doc['field'].size() == 0`. For example, this script will use
the value `1` if a document doesn't have the field `field`:

[source,js]
--------------------------------------------------
"script" : {
    "source" : "Math.log10((doc['field'].size() == 0 ? 1 : doc['field'].value) * params.factor)",
    "params" : {
        "factor" : 5
    }
}
--------------------------------------------------
// NOTCONSOLE

This table lists how `field_value_factor` modifiers can be implemented
through a script:

[cols="<,<",options="header",]
|=======================================================================
| Modifier | Implementation in Script Score

| `none` | -
| `log` | `Math.log10(doc['f'].value)`
| `log1p` | `Math.log10(doc['f'].value + 1)`
| `log2p` | `Math.log10(doc['f'].value + 2)`
| `ln` | `Math.log(doc['f'].value)`
| `ln1p` | `Math.log(doc['f'].value + 1)`
| `ln2p` | `Math.log(doc['f'].value + 2)`
| `square` | `Math.pow(doc['f'].value, 2)`
| `sqrt` | `Math.sqrt(doc['f'].value)`
| `reciprocal` | `1.0 / doc['f'].value`
|=======================================================================

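
To illustrate how a modifier combines with `factor` and a default for missing
values, consider a `field_value_factor` function like this one, where the
`likes` field and the parameter values are assumptions chosen for the example:

[source,js]
--------------------------------------------------
"field_value_factor" : {
    "field" : "likes",
    "factor" : 1.2,
    "modifier" : "sqrt",
    "missing" : 1
}
--------------------------------------------------
// NOTCONSOLE

A script along the following lines should compute the same function value,
applying the factor and the modifier to the default when the field is missing:

[source,js]
--------------------------------------------------
"script" : {
    "source" : "Math.sqrt((doc['likes'].size() == 0 ? params.missing : doc['likes'].value) * params.factor)",
    "params" : {
        "factor" : 1.2,
        "missing" : 1
    }
}
--------------------------------------------------
// NOTCONSOLE
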
[[decay-functions]]
===== `decay functions`

The Script Score query has equivalent <<decay-functions-numeric-fields, decay functions>>
that can be used in a script.
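
For example, assuming the numeric `dval` field used earlier on this page,
a `gauss` decay function of the Function Score query such as:

[source,js]
--------------------------------------------------
"gauss" : {
    "dval" : {
        "origin" : 20,
        "scale" : 10,
        "offset" : 0,
        "decay" : 0.5
    }
}
--------------------------------------------------
// NOTCONSOLE

could be expressed in the Script Score query as the following sketch:

[source,js]
--------------------------------------------------
"script" : {
    "source" : "decayNumericGauss(params.origin, params.scale, params.offset, params.decay, doc['dval'].value)",
    "params" : {
        "origin" : 20,
        "scale" : 10,
        "offset" : 0,
        "decay" : 0.5
    }
}
--------------------------------------------------
// NOTCONSOLE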