123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564 |
- [[modules-scripting-using]]
- == How to write scripts
- Wherever scripting is supported in the {es} APIs, the syntax follows the same
- pattern; you specify the language of your script, provide the script logic (or
- source, and add parameters that are passed into the script:
- [source,js]
- -------------------------------------
- "script": {
- "lang": "...",
- "source" | "id": "...",
- "params": { ... }
- }
- -------------------------------------
- // NOTCONSOLE
- `lang`::
- Specifies the language the script is written in. Defaults to `painless`.
- `source`, `id`::
- The script itself, which you specify as `source` for an inline script or `id` for a stored script. Use the `_scripts` endpoint to retrieve a stored script. For example, you can create or delete a <<script-stored-scripts,stored script>> by calling `POST _scripts/{id}` and `DELETE _scripts/{id}`.
- `params`::
- Specifies any named parameters that are passed into the script as
- variables. <<prefer-params,Use parameters>> instead of hard-coded values to decrease compile time.
- [discrete]
- [[hello-world-script]]
- === Write your first script
- <<modules-scripting-painless,Painless>> is the default scripting language
- for {es}. It is secure, performant, and provides a natural syntax for anyone
- with a little coding experience.
- A Painless script is structured as one or more statements and optionally
- has one or more user-defined functions at the beginning. A script must always
- have at least one statement.
- The {painless}/painless-execute-api.html[Painless execute API] provides the ability to
- test a script with simple user-defined parameters and receive a result. Let's
- start with a complete script and review its constituent parts.
- First, index a document with a single field so that we have some data to work
- with:
- [source,console]
- ----
- PUT my-index-000001/_doc/1
- {
- "my_field": 5
- }
- ----
- We can then construct a script that operates on that field and run evaluate the
- script as part of a query. The following query uses the
- <<script-fields,`script_fields`>> parameter of the search API to retrieve a
- script valuation. There's a lot happening here, but we'll break it down the
- components to understand them individually. For now, you only need to
- understand that this script takes `my_field` and operates on it.
- [source,console]
- ----
- GET my-index-000001/_search
- {
- "script_fields": {
- "my_doubled_field": {
- "script": { <1>
- "source": "doc['my_field'].value * params['multiplier']", <2>
- "params": {
- "multiplier": 2
- }
- }
- }
- }
- }
- ----
- // TEST[continued]
- <1> `script` object
- <2> `script` source
- The `script` is a standard JSON object that defines scripts under most APIs
- in {es}. This object requires `source` to define the script itself. The
- script doesn't specify a language, so it defaults to Painless.
- [discrete]
- [[prefer-params]]
- === Use parameters in your script
- The first time {es} sees a new script, it compiles the script and stores the
- compiled version in a cache. Compilation can be a heavy process. Rather than
- hard-coding values in your script, pass them as named `params` instead.
- For example, in the previous script, we could have just hard coded values and
- written a script that is seemingly less complex. We could just retrieve the
- first value for `my_field` and then multiply it by `2`:
- [source,painless]
- ----
- "source": "return doc['my_field'].value * 2"
- ----
- Though it works, this solution is pretty inflexible. We have to modify the
- script source to change the multiplier, and {es} has to recompile the script
- every time that the multiplier changes.
- Instead of hard-coding values, use named `params` to make scripts flexible, and
- also reduce compilation time when the script runs. You can now make changes to
- the `multiplier` parameter without {es} recompiling the script.
- [source,painless]
- ----
- "source": "doc['my_field'].value * params['multiplier']",
- "params": {
- "multiplier": 2
- }
- ----
- For most contexts, you can compile up to 75 scripts per 5 minutes by default.
- For ingest contexts, the default script compilation rate is unlimited. You
- can change these settings dynamically by setting
- `script.context.$CONTEXT.max_compilations_rate`. For example, the following
- setting limits script compilation to 100 scripts every 10 minutes for the
- {painless}/painless-field-context.html[field context]:
- [source,js]
- ----
- script.context.field.max_compilations_rate=100/10m
- ----
- // NOTCONSOLE
- IMPORTANT: If you compile too many unique scripts within a short time, {es}
- rejects the new dynamic scripts with a `circuit_breaking_exception` error.
- [discrete]
- [[script-shorten-syntax]]
- === Shorten your script
- Using syntactic abilities that are native to Painless, you can reduce verbosity
- in your scripts and make them shorter. Here's a simple script that we can make
- shorter:
- [source,console]
- ----
- GET my-index-000001/_search
- {
- "script_fields": {
- "my_doubled_field": {
- "script": {
- "lang": "painless",
- "source": "return doc['my_field'].value * params.get('multiplier');",
- "params": {
- "multiplier": 2
- }
- }
- }
- }
- }
- ----
- // TEST[s/^/PUT my-index-000001\n/]
- Let's look at a shortened version of the script to see what improvements it
- includes over the previous iteration:
- [source,console]
- ----
- GET my-index-000001/_search
- {
- "script_fields": {
- "my_doubled_field": {
- "script": {
- "source": "doc['my_field'].value * params['multiplier']",
- "params": {
- "multiplier": 2
- }
- }
- }
- }
- }
- ----
- // TEST[s/^/PUT my-index-000001\n/]
- This version of the script removes several components and simplifies the syntax
- significantly:
- * The `lang` declaration. Because Painless is the default language, you don't
- need to specify the language if you're writing a Painless script.
- * The `return` keyword. Painless automatically uses the final statement in a
- script (when possible) to produce a return value in a script context that
- requires one.
- * The `get` method, which is replaced with brackets `[]`. Painless
- uses a shortcut specifically for the `Map` type that allows us to use brackets
- instead of the lengthier `get` method.
- * The semicolon at the end of the `source` statement. Painless does not
- require semicolons for the final statement of a block. However, it does require
- them in other cases to remove ambiguity.
- Use this abbreviated syntax anywhere that {es} supports scripts, such as
- when you're creating <<runtime-mapping-fields,runtime fields>>.
- [discrete]
- [[script-stored-scripts]]
- === Store and retrieve scripts
- You can store and retrieve scripts from the cluster state using the `_scripts`
- endpoint. Using stored scripts can help to reduce compilation time and make
- searches faster. Use the `{id}` path element in `_scripts/{id}` to refer to a
- stored script.
- NOTE: Unlike regular scripts, stored scripts require that you specify a script
- language using the `lang` parameter.
- For example, let's create a stored script in the cluster named
- `calculate-score`:
- [source,console]
- -----------------------------------
- POST _scripts/calculate-score
- {
- "script": {
- "lang": "painless",
- "source": "Math.log(_score * 2) + params['my_modifier']"
- }
- }
- -----------------------------------
- // TEST[setup:my_index]
- You can retrieve that script by using the `_scripts` endpoint:
- [source,console]
- -----------------------------------
- GET _scripts/calculate-score
- -----------------------------------
- // TEST[continued]
- To use the stored script in a query, include the script `id` in the `script`
- declaration:
- [source,console]
- ----
- GET my-index-000001/_search
- {
- "query": {
- "script_score": {
- "query": {
- "match": {
- "message": "some message"
- }
- },
- "script": {
- "id": "calculate-score", <1>
- "params": {
- "my_modifier": 2
- }
- }
- }
- }
- }
- ----
- // TEST[continued]
- <1> `id` of the stored script
- To delete a stored script, submit a delete request to the `_scripts` endpoint
- and specify the stored script `id`:
- [source,console]
- ----
- DELETE _scripts/calculate-score
- ----
- // TEST[continued]
- [discrete]
- [[scripts-update-scripts]]
- === Update documents with scripts
- You can use the <<docs-update,update API>> to update documents with a specified
- script. The script can update, delete, or skip modifying the document. The
- update API also supports passing a partial document, which is merged into the
- existing document.
- First, let's index a simple document:
- [source,console]
- ----
- PUT my-index-000001/_doc/1
- {
- "counter" : 1,
- "tags" : ["red"]
- }
- ----
- To increment the counter, you can submit an update request with the following
- script:
- [source,console]
- ----
- POST my-index-000001/_update/1
- {
- "script" : {
- "source": "ctx._source.counter += params.count",
- "lang": "painless",
- "params" : {
- "count" : 4
- }
- }
- }
- ----
- // TEST[continued]
- Similarly, you can use an update script to add a tag to the list of tags.
- Because this is just a list, the tag is added even it exists:
- [source,console]
- ----
- POST my-index-000001/_update/1
- {
- "script": {
- "source": "ctx._source.tags.add(params['tag'])",
- "lang": "painless",
- "params": {
- "tag": "blue"
- }
- }
- }
- ----
- // TEST[continued]
- You can also remove a tag from the list of tags. The `remove` method of a Java
- `List` is available in Painless. It takes the index of the element you
- want to remove. To avoid a possible runtime error, you first need to make sure
- the tag exists. If the list contains duplicates of the tag, this script just
- removes one occurrence.
- [source,console]
- ----
- POST my-index-000001/_update/1
- {
- "script": {
- "source": "if (ctx._source.tags.contains(params['tag'])) { ctx._source.tags.remove(ctx._source.tags.indexOf(params['tag'])) }",
- "lang": "painless",
- "params": {
- "tag": "blue"
- }
- }
- }
- ----
- // TEST[continued]
- You can also add and remove fields from a document. For example, this script
- adds the field `new_field`:
- [source,console]
- ----
- POST my-index-000001/_update/1
- {
- "script" : "ctx._source.new_field = 'value_of_new_field'"
- }
- ----
- // TEST[continued]
- Conversely, this script removes the field `new_field`:
- [source,console]
- ----
- POST my-index-000001/_update/1
- {
- "script" : "ctx._source.remove('new_field')"
- }
- ----
- // TEST[continued]
- Instead of updating the document, you can also change the operation that is
- executed from within the script. For example, this request deletes the document
- if the `tags` field contains `green`. Otherwise it does nothing (`noop`):
- [source,console]
- ----
- POST my-index-000001/_update/1
- {
- "script": {
- "source": "if (ctx._source.tags.contains(params['tag'])) { ctx.op = 'delete' } else { ctx.op = 'none' }",
- "lang": "painless",
- "params": {
- "tag": "green"
- }
- }
- }
- ----
- // TEST[continued]
- [[scripts-and-search-speed]]
- === Scripts, caching, and search speed
- {es} performs a number of optimizations to make using scripts as fast as
- possible. One important optimization is a script cache. The compiled script is
- placed in a cache so that requests that reference the script do not incur a
- compilation penalty.
- Cache sizing is important. Your script cache should be large enough to hold all
- of the scripts that users need to be accessed concurrently.
- If you see a large number of script cache evictions and a rising number of
- compilations in <<cluster-nodes-stats,node stats>>, your cache might be too
- small.
- All scripts are cached by default so that they only need to be recompiled
- when updates occur. By default, scripts do not have a time-based expiration.
- You can change this behavior by using the `script.cache.expire` setting.
- Use the `script.cache.max_size` setting to configure the size of the cache.
- NOTE: The size of scripts is limited to 65,535 bytes. Set the value of `script.max_size_in_bytes` to increase that soft limit. If your scripts are
- really large, then consider using a
- <<modules-scripting-engine,native script engine>>.
- [discrete]
- ==== Improving search speed
- Scripts are incredibly useful, but can't use {es}'s index structures or related
- optimizations. This relationship can sometimes result in slower search speeds.
- If you often use scripts to transform indexed data, you can make search faster
- by transforming data during ingest instead. However, that often means slower
- index speeds. Let's look at a practical example to illustrate how you can
- increase search speed.
- When running searches, it's common to sort results by the sum of two values.
- For example, consider an index named `my_test_scores` that contains test score
- data. This index includes two fields of type `long`:
- * `math_score`
- * `verbal_score`
- You can run a query with a script that adds these values together. There's
- nothing wrong with this approach, but the query will be slower because the
- script valuation occurs as part of the request. The following request returns
- documents where `grad_year` equals `2099`, and sorts by the results by the
- valuation of the script.
- [source,console]
- ----
- GET /my_test_scores/_search
- {
- "query": {
- "term": {
- "grad_year": "2099"
- }
- },
- "sort": [
- {
- "_script": {
- "type": "number",
- "script": {
- "source": "doc['math_score'].value + doc['verbal_score'].value"
- },
- "order": "desc"
- }
- }
- ]
- }
- ----
- // TEST[s/^/PUT my_test_scores\n/]
- If you're searching a small index, then including the script as part of your
- search query can be a good solution. If you want to make search faster, you can
- perform this calculation during ingest and index the sum to a field instead.
- First, we'll add a new field to the index named `total_score`, which will
- contain sum of the `math_score` and `verbal_score` field values.
- [source,console]
- ----
- PUT /my_test_scores/_mapping
- {
- "properties": {
- "total_score": {
- "type": "long"
- }
- }
- }
- ----
- // TEST[continued]
- Next, use an <<ingest,ingest pipeline>> containing the
- <<script-processor,script processor>> to calculate the sum of `math_score` and
- `verbal_score` and index it in the `total_score` field.
- [source,console]
- ----
- PUT _ingest/pipeline/my_test_scores_pipeline
- {
- "description": "Calculates the total test score",
- "processors": [
- {
- "script": {
- "source": "ctx.total_score = (ctx.math_score + ctx.verbal_score)"
- }
- }
- ]
- }
- ----
- // TEST[continued]
- To update existing data, use this pipeline to <<docs-reindex,reindex>> any
- documents from `my_test_scores` to a new index named `my_test_scores_2`.
- [source,console]
- ----
- POST /_reindex
- {
- "source": {
- "index": "my_test_scores"
- },
- "dest": {
- "index": "my_test_scores_2",
- "pipeline": "my_test_scores_pipeline"
- }
- }
- ----
- // TEST[continued]
- Continue using the pipeline to index any new documents to `my_test_scores_2`.
- [source,console]
- ----
- POST /my_test_scores_2/_doc/?pipeline=my_test_scores_pipeline
- {
- "student": "kimchy",
- "grad_year": "2099",
- "math_score": 1200,
- "verbal_score": 800
- }
- ----
- // TEST[continued]
- These changes slow the index process, but allow for faster searches. Instead of
- using a script, you can sort searches made on `my_test_scores_2` using the
- `total_score` field. The response is near real-time! Though this process slows
- ingest time, it greatly increases queries at search time.
- [source,console]
- ----
- GET /my_test_scores_2/_search
- {
- "query": {
- "term": {
- "grad_year": "2099"
- }
- },
- "sort": [
- {
- "total_score": {
- "order": "desc"
- }
- }
- ]
- }
- ----
- // TEST[continued]
- ////
- [source,console]
- ----
- DELETE /_ingest/pipeline/my_test_scores_pipeline
- ----
- // TEST[continued]
- ////
|