[[modules-scripting-using]]
== How to write scripts

Wherever scripting is supported in the {es} APIs, the syntax follows the same
pattern: you specify the language of your script, provide the script logic (or
source), and add parameters that are passed into the script:

[source,js]
-------------------------------------
  "script": {
    "lang": "...",
    "source" | "id": "...",
    "params": { ... }
  }
-------------------------------------
// NOTCONSOLE

`lang`::
Specifies the language the script is written in. Defaults to `painless`.

`source`, `id`::
The script itself, which you specify as `source` for an inline script or
`id` for a stored script. Use the <<stored-script-apis,stored script APIs>>
to create and manage stored scripts.

`params`::
Specifies any named parameters that are passed into the script as
variables. <<prefer-params,Use parameters>> instead of hard-coded values to
decrease compile time.

[discrete]
[[hello-world-script]]
=== Write your first script

<<modules-scripting-painless,Painless>> is the default scripting language
for {es}. It is secure, performant, and provides a natural syntax for anyone
with a little coding experience.

A Painless script is structured as one or more statements and optionally
has one or more user-defined functions at the beginning. A script must always
have at least one statement.

The {painless}/painless-execute-api.html[Painless execute API] provides the ability to
test a script with simple user-defined parameters and receive a result. Let's
start with a complete script and review its constituent parts.

First, index a document with a single field so that we have some data to work
with:

[source,console]
----
PUT my-index-000001/_doc/1
{
  "my_field": 5
}
----

We can then construct a script that operates on that field and evaluate the
script as part of a query. The following query uses the
<<script-fields,`script_fields`>> parameter of the search API to retrieve a
script valuation. There's a lot happening here, but we'll break down the
components to understand them individually. For now, you only need to
understand that this script takes `my_field` and operates on it.

[source,console]
----
GET my-index-000001/_search
{
  "script_fields": {
    "my_doubled_field": {
      "script": { <1>
        "source": "doc['my_field'].value * params['multiplier']", <2>
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}
----
// TEST[continued]
<1> `script` object
<2> `script` source

The `script` is a standard JSON object that defines scripts under most APIs
in {es}. This object requires `source` to define the script itself. The
script doesn't specify a language, so it defaults to Painless.

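To try a script without indexing any data, you can send it to the Painless
execute API mentioned earlier. The following request is a minimal sketch: it
replaces the document field with a second parameter, `value`, which is an
illustrative name that doesn't appear elsewhere on this page.

[source,console]
----
POST /_scripts/painless/_execute
{
  "script": {
    "source": "params['value'] * params['multiplier']",
    "params": {
      "value": 5,
      "multiplier": 2
    }
  }
}
----

Because the request doesn't specify a context, the script runs in the default
test context, which can read `params` but has no access to `doc` values. The
response returns the computed value in a `result` field.
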
[discrete]
[[prefer-params]]
=== Use parameters in your script

The first time {es} sees a new script, it compiles the script and stores the
compiled version in a cache. Compilation can be a heavy process. Rather than
hard-coding values in your script, pass them as named `params` instead.

For example, in the previous script, we could have just hard-coded values and
written a script that is seemingly less complex. We could just retrieve the
first value for `my_field` and then multiply it by `2`:

[source,painless]
----
"source": "return doc['my_field'].value * 2"
----

Though it works, this solution is pretty inflexible. We have to modify the
script source to change the multiplier, and {es} has to recompile the script
every time that the multiplier changes.

Instead of hard-coding values, use named `params` to make scripts flexible and
to reduce compilation time when the script runs. You can now make changes to
the `multiplier` parameter without {es} recompiling the script.

[source,painless]
----
"source": "doc['my_field'].value * params['multiplier']",
"params": {
  "multiplier": 2
}
----

For most contexts, you can compile up to 75 scripts per 5 minutes by default.
For ingest contexts, the default script compilation rate is unlimited. You
can change these settings dynamically by setting
`script.context.$CONTEXT.max_compilations_rate`. For example, the following
setting limits script compilation to 100 scripts every 10 minutes for the
{painless}/painless-field-context.html[field context]:

[source,js]
----
script.context.field.max_compilations_rate=100/10m
----
// NOTCONSOLE

IMPORTANT: If you compile too many unique scripts within a short time, {es}
rejects the new dynamic scripts with a `circuit_breaking_exception` error.

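One way to apply such a setting dynamically is through the cluster update
settings API. The following request is a sketch that assumes your cluster uses
context-specific compilation rates. It reuses the `field` context and the
`100/10m` rate from the setting shown earlier in this section.

[source,console]
----
PUT _cluster/settings
{
  "persistent": {
    "script.context.field.max_compilations_rate": "100/10m"
  }
}
----

A `persistent` setting survives a full cluster restart. Use `transient`
instead if you only want to raise the limit temporarily.
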
[discrete]
[[script-shorten-syntax]]
=== Shorten your script

Using syntactic abilities that are native to Painless, you can reduce verbosity
in your scripts and make them shorter. Here's a simple script that we can make
shorter:

[source,console]
----
GET my-index-000001/_search
{
  "script_fields": {
    "my_doubled_field": {
      "script": {
        "lang": "painless",
        "source": "return doc['my_field'].value * params.get('multiplier');",
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}
----
// TEST[s/^/PUT my-index-000001\n/]

Let's look at a shortened version of the script to see what improvements it
includes over the previous iteration:

[source,console]
----
GET my-index-000001/_search
{
  "script_fields": {
    "my_doubled_field": {
      "script": {
        "source": "doc['my_field'].value * params['multiplier']",
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}
----
// TEST[s/^/PUT my-index-000001\n/]

This version of the script removes several components and simplifies the syntax
significantly:

* The `lang` declaration. Because Painless is the default language, you don't
need to specify the language if you're writing a Painless script.
* The `return` keyword. Painless automatically uses the final statement in a
script (when possible) to produce a return value in a script context that
requires one.
* The `get` method, which is replaced with brackets `[]`. Painless
uses a shortcut specifically for the `Map` type that allows us to use brackets
instead of the lengthier `get` method.
* The semicolon at the end of the `source` statement. Painless does not
require semicolons for the final statement of a block. However, it does require
them in other cases to remove ambiguity.

Use this abbreviated syntax anywhere that {es} supports scripts, such as
when you're creating <<runtime-mapping-fields,runtime fields>>.

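As a sketch of how the abbreviated syntax carries over to runtime fields, the
following search request defines `my_doubled_field` as a runtime field in the
request itself. One difference is that runtime field scripts hand back their
value with `emit` rather than relying on the final statement. The `long` type
is an assumption that matches the integer value indexed earlier.

[source,console]
----
GET my-index-000001/_search
{
  "runtime_mappings": {
    "my_doubled_field": {
      "type": "long",
      "script": {
        "source": "emit(doc['my_field'].value * params['multiplier'])",
        "params": {
          "multiplier": 2
        }
      }
    }
  },
  "fields": ["my_doubled_field"]
}
----

The script still omits `lang`, the trailing semicolon, and the `get` method,
and it reads the multiplier with the bracket shortcut.
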
[discrete]
[[script-stored-scripts]]
=== Store and retrieve scripts

You can store and retrieve scripts from the cluster state using the
<<stored-script-apis,stored script APIs>>. Stored scripts reduce compilation
time and make searches faster.

NOTE: Unlike regular scripts, stored scripts require that you specify a script
language using the `lang` parameter.

To create a script, use the <<create-stored-script-api,create stored script
API>>. For example, the following request creates a stored script named
`calculate-score`.

[source,console]
----
POST _scripts/calculate-score
{
  "script": {
    "lang": "painless",
    "source": "Math.log(_score * 2) + params['my_modifier']"
  }
}
----

You can retrieve that script by using the <<get-stored-script-api,get stored
script API>>.

[source,console]
----
GET _scripts/calculate-score
----
// TEST[continued]

To use the stored script in a query, include the script `id` in the `script`
declaration:

[source,console]
----
GET my-index-000001/_search
{
  "query": {
    "script_score": {
      "query": {
        "match": {
          "message": "some message"
        }
      },
      "script": {
        "id": "calculate-score", <1>
        "params": {
          "my_modifier": 2
        }
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[continued]
<1> `id` of the stored script

To delete a stored script, submit a <<delete-stored-script-api,delete stored
script API>> request.

[source,console]
----
DELETE _scripts/calculate-score
----
// TEST[continued]

[discrete]
[[scripts-update-scripts]]
=== Update documents with scripts

You can use the <<docs-update,update API>> to update documents with a specified
script. The script can update, delete, or skip modifying the document. The
update API also supports passing a partial document, which is merged into the
existing document.

First, let's index a simple document:

[source,console]
----
PUT my-index-000001/_doc/1
{
  "counter" : 1,
  "tags" : ["red"]
}
----

To increment the counter, you can submit an update request with the following
script:

[source,console]
----
POST my-index-000001/_update/1
{
  "script" : {
    "source": "ctx._source.counter += params.count",
    "lang": "painless",
    "params" : {
      "count" : 4
    }
  }
}
----
// TEST[continued]

Similarly, you can use an update script to add a tag to the list of tags.
Because this is just a list, the tag is added even if it already exists:

[source,console]
----
POST my-index-000001/_update/1
{
  "script": {
    "source": "ctx._source.tags.add(params['tag'])",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}
----
// TEST[continued]

You can also remove a tag from the list of tags. The `remove` method of a Java
`List` is available in Painless. It takes the index of the element you
want to remove. To avoid a possible runtime error, you first need to make sure
the tag exists. If the list contains duplicates of the tag, this script just
removes one occurrence.

[source,console]
----
POST my-index-000001/_update/1
{
  "script": {
    "source": "if (ctx._source.tags.contains(params['tag'])) { ctx._source.tags.remove(ctx._source.tags.indexOf(params['tag'])) }",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}
----
// TEST[continued]

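If you want to remove every occurrence of a tag rather than just one, one
option is the `removeIf` method of Java's `Collection`, used here with a
Painless lambda. This is a sketch that assumes `removeIf` is available for the
list in your update context. It leaves the list unchanged when the tag is
absent, so no `contains` check is needed.

[source,console]
----
POST my-index-000001/_update/1
{
  "script": {
    "source": "ctx._source.tags.removeIf(tag -> tag == params['tag'])",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}
----
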
You can also add and remove fields from a document. For example, this script
adds the field `new_field`:

[source,console]
----
POST my-index-000001/_update/1
{
  "script" : "ctx._source.new_field = 'value_of_new_field'"
}
----
// TEST[continued]

Conversely, this script removes the field `new_field`:

[source,console]
----
POST my-index-000001/_update/1
{
  "script" : "ctx._source.remove('new_field')"
}
----
// TEST[continued]

Instead of updating the document, you can also change the operation that is
executed from within the script. For example, this request deletes the document
if the `tags` field contains `green`. Otherwise it does nothing (`noop`):

[source,console]
----
POST my-index-000001/_update/1
{
  "script": {
    "source": "if (ctx._source.tags.contains(params['tag'])) { ctx.op = 'delete' } else { ctx.op = 'none' }",
    "lang": "painless",
    "params": {
      "tag": "green"
    }
  }
}
----
// TEST[continued]

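The partial-document form mentioned at the start of this section doesn't need
a script at all. As a minimal sketch, the following request merges a
hypothetical `status` field into the existing document by using the `doc`
parameter of the update API:

[source,console]
----
POST my-index-000001/_update/1
{
  "doc": {
    "status": "reviewed"
  }
}
----

Fields that already exist in the document and aren't listed under `doc` are
left unchanged.
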
[[scripts-and-search-speed]]
=== Scripts, caching, and search speed

{es} performs a number of optimizations to make using scripts as fast as
possible. One important optimization is a script cache. The compiled script is
placed in a cache so that requests that reference the script do not incur a
compilation penalty.

Cache sizing is important. Your script cache should be large enough to hold all
of the scripts that users need to access concurrently.

If you see a large number of script cache evictions and a rising number of
compilations in <<cluster-nodes-stats,node stats>>, your cache might be too
small.

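To check these numbers, you can restrict the node stats request to the
`script` metric. This is a minimal sketch. The exact set of fields in the
response depends on your {es} version.

[source,console]
----
GET _nodes/stats/script
----

In the response, compare the `compilations` and `cache_evictions` counters
under each node's `script` object over time.
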
All scripts are cached by default so that they only need to be recompiled
when updates occur. By default, scripts do not have a time-based expiration.
You can change this behavior by using the
`script.context.$CONTEXT.cache_expire` setting. Use the
`script.context.$CONTEXT.cache_max_size` setting to configure the size of the
cache.

NOTE: The size of scripts is limited to 65,535 bytes. Set the value of
`script.max_size_in_bytes` to increase that soft limit. If your scripts are
really large, then consider using a
<<modules-scripting-engine,native script engine>>.

[discrete]
==== Improving search speed

Scripts are incredibly useful, but can't use {es}'s index structures or related
optimizations. This relationship can sometimes result in slower search speeds.

If you often use scripts to transform indexed data, you can make search faster
by transforming data during ingest instead. However, that often means slower
index speeds. Let's look at a practical example to illustrate how you can
increase search speed.

When running searches, it's common to sort results by the sum of two values.
For example, consider an index named `my_test_scores` that contains test score
data. This index includes two fields of type `long`:

* `math_score`
* `verbal_score`

You can run a query with a script that adds these values together. There's
nothing wrong with this approach, but the query will be slower because the
script valuation occurs as part of the request. The following request returns
documents where `grad_year` equals `2099`, and sorts the results by the
valuation of the script.

[source,console]
----
GET /my_test_scores/_search
{
  "query": {
    "term": {
      "grad_year": "2099"
    }
  },
  "sort": [
    {
      "_script": {
        "type": "number",
        "script": {
          "source": "doc['math_score'].value + doc['verbal_score'].value"
        },
        "order": "desc"
      }
    }
  ]
}
----
// TEST[s/^/PUT my_test_scores\n/]

If you're searching a small index, then including the script as part of your
search query can be a good solution. If you want to make search faster, you can
perform this calculation during ingest and index the sum to a field instead.

First, we'll add a new field to the index named `total_score`, which will
contain the sum of the `math_score` and `verbal_score` field values.

[source,console]
----
PUT /my_test_scores/_mapping
{
  "properties": {
    "total_score": {
      "type": "long"
    }
  }
}
----
// TEST[continued]

Next, use an <<ingest,ingest pipeline>> containing the
<<script-processor,script processor>> to calculate the sum of `math_score` and
`verbal_score` and index it in the `total_score` field.

[source,console]
----
PUT _ingest/pipeline/my_test_scores_pipeline
{
  "description": "Calculates the total test score",
  "processors": [
    {
      "script": {
        "source": "ctx.total_score = (ctx.math_score + ctx.verbal_score)"
      }
    }
  ]
}
----
// TEST[continued]

To update existing data, use this pipeline to <<docs-reindex,reindex>> any
documents from `my_test_scores` to a new index named `my_test_scores_2`.

[source,console]
----
POST /_reindex
{
  "source": {
    "index": "my_test_scores"
  },
  "dest": {
    "index": "my_test_scores_2",
    "pipeline": "my_test_scores_pipeline"
  }
}
----
// TEST[continued]

Continue using the pipeline to index any new documents to `my_test_scores_2`.

[source,console]
----
POST /my_test_scores_2/_doc/?pipeline=my_test_scores_pipeline
{
  "student": "kimchy",
  "grad_year": "2099",
  "math_score": 1200,
  "verbal_score": 800
}
----
// TEST[continued]

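To avoid passing the `pipeline` query parameter on every request, one option
is to make the pipeline the default for the index. This sketch assumes that
the `index.default_pipeline` index setting suits your use case. It applies the
pipeline to every indexing request that doesn't name another one.

[source,console]
----
PUT /my_test_scores_2/_settings
{
  "index": {
    "default_pipeline": "my_test_scores_pipeline"
  }
}
----
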
These changes slow the indexing process, but allow for faster searches. Instead
of using a script, you can sort searches made on `my_test_scores_2` using the
`total_score` field. The response is near real-time! Though this process slows
ingest time, it greatly speeds up queries at search time.

[source,console]
----
GET /my_test_scores_2/_search
{
  "query": {
    "term": {
      "grad_year": "2099"
    }
  },
  "sort": [
    {
      "total_score": {
        "order": "desc"
      }
    }
  ]
}
----
// TEST[continued]

////
[source,console]
----
DELETE /_ingest/pipeline/my_test_scores_pipeline
----
// TEST[continued]

////

include::dissect-syntax.asciidoc[]
include::grok-syntax.asciidoc[]
  487. include::grok-syntax.asciidoc[]