- [[modules-scripting]]
- == Scripting
- The scripting module allows you to use scripts to evaluate custom
- expressions. For example, scripts can be used to return "script fields"
- as part of a search request, or to evaluate a custom score
- for a query.
- deprecated[1.3.0,Mvel has been deprecated and will be removed in 1.4.0]
- added[1.3.0,Groovy scripting support]
- By default, the scripting module uses http://groovy.codehaus.org/[Groovy]
- (previously http://mvel.codehaus.org/[MVEL] in 1.3.x and earlier) as the
- scripting language, with some extensions. Groovy was chosen because it is
- fast and simple to use.
- Additional `lang` plugins allow scripts to be executed in
- different languages. Currently supported plugins are `lang-javascript`
- for JavaScript, `lang-mvel` for MVEL, and `lang-python` for Python.
- Wherever a `script` parameter can be used, a `lang` parameter
- (on the same level) can be provided to define the language of the
- script. The `lang` options are `groovy`, `js`, `mvel`, `python`,
- `expression` and `native`.
- added[1.2.0, Dynamic scripting is disabled for non-sandboxed languages by default since version 1.2.0]
- To increase security, Elasticsearch does not allow you to specify scripts for
- non-sandboxed languages with a request. Instead, scripts must be placed in the
- `scripts` directory inside the configuration directory (the directory that
- contains `elasticsearch.yml`). Scripts placed in this directory are picked up
- automatically and become available for use. Once a script has been placed in this
- directory, it can be referenced by name. For example, a script called
- `calculate-score.groovy` can be referenced in a request like this:
- [source,sh]
- --------------------------------------------------
- $ tree config
- config
- ├── elasticsearch.yml
- ├── logging.yml
- └── scripts
-     └── calculate-score.groovy
- --------------------------------------------------
- [source,sh]
- --------------------------------------------------
- $ cat config/scripts/calculate-score.groovy
- log(_score * 2) + my_modifier
- --------------------------------------------------
- [source,js]
- --------------------------------------------------
- curl -XPOST localhost:9200/_search -d '{
-   "query": {
-     "function_score": {
-       "query": {
-         "match": {
-           "body": "foo"
-         }
-       },
-       "functions": [
-         {
-           "script_score": {
-             "script": "calculate-score",
-             "params": {
-               "my_modifier": 8
-             }
-           }
-         }
-       ]
-     }
-   }
- }'
- --------------------------------------------------
- The name of the script is derived from the hierarchy of directories it
- exists under, and the file name without the lang extension. For example,
- a script placed under `config/scripts/group1/group2/test.py` will be
- named `group1_group2_test`.
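The naming rule above can be sketched in Java. This is only an illustration of the rule as described, not Elasticsearch's own code: the class `ScriptNames` and the method `scriptName` are hypothetical names, and the path is assumed to be given relative to `config/scripts` with `/` as the separator.

```java
final class ScriptNames {
    // Hypothetical helper: derives a script name from a path that is
    // relative to config/scripts and uses '/' as the separator.
    static String scriptName(String relativePath) {
        String[] parts = relativePath.split("/");
        int last = parts.length - 1;
        // Strip the language extension from the file name, if there is one.
        int dot = parts[last].lastIndexOf('.');
        if (dot > 0) {
            parts[last] = parts[last].substring(0, dot);
        }
        // Directory names and the file name are joined with underscores.
        return String.join("_", parts);
    }

    public static void main(String[] args) {
        System.out.println(scriptName("group1/group2/test.py"));  // group1_group2_test
        System.out.println(scriptName("calculate-score.groovy")); // calculate-score
    }
}
```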
- [float]
- === Indexed Scripts
- If dynamic scripting is enabled, Elasticsearch allows you to store scripts
- in an internal index known as `.scripts` and reference them by id. There are
- REST endpoints for managing indexed scripts. Requests to the scripts
- endpoint look like this:
- [source,js]
- -----------------------------------
- /_scripts/{lang}/{id}
- -----------------------------------
- Where the `lang` part is the language the script is in and the `id` part is the id
- of the script. In the `.scripts` index the type of the document will be set to the `lang`.
- [source,js]
- -----------------------------------
- curl -XPOST localhost:9200/_scripts/groovy/indexedCalculateScore -d '{
-   "script": "log(_score * 2) + my_modifier"
- }'
- -----------------------------------
- This creates a document with id `indexedCalculateScore` and type `groovy` in the
- `.scripts` index.
- The script can be used at query time by appending `_id` to the name of
- the script parameter, so that `script` becomes `script_id`, and passing the script id as its value:
- [source,js]
- --------------------------------------------------
- curl -XPOST localhost:9200/_search -d '{
-   "query": {
-     "function_score": {
-       "query": {
-         "match": {
-           "body": "foo"
-         }
-       },
-       "functions": [
-         {
-           "script_score": {
-             "script_id": "indexedCalculateScore",
-             "lang": "groovy",
-             "params": {
-               "my_modifier": 8
-             }
-           }
-         }
-       ]
-     }
-   }
- }'
- --------------------------------------------------
- Note that you must have dynamic scripting enabled to use indexed scripts
- at query time.
- The script can be viewed by:
- [source,js]
- -----------------------------------
- curl -XGET localhost:9200/_scripts/groovy/indexedCalculateScore
- -----------------------------------
- This is rendered as:
- [source,js]
- -----------------------------------
- {
-   "script": "log(_score * 2) + my_modifier"
- }
- -----------------------------------
- Indexed scripts can be deleted by:
- [source,js]
- -----------------------------------
- curl -XDELETE localhost:9200/_scripts/groovy/indexedCalculateScore
- -----------------------------------
- [float]
- === Enabling dynamic scripting
- We recommend running Elasticsearch behind an application or proxy, which
- protects Elasticsearch from the outside world. If users are allowed to run
- dynamic scripts (even in a search request), then they have the same access to
- your box as the user that Elasticsearch is running as. For this reason dynamic
- scripting is allowed only for sandboxed languages by default.
- First, you should not run Elasticsearch as the `root` user, as this would allow
- a script to access or do *anything* on your server, without limitations. Second,
- you should not expose Elasticsearch directly to users, but instead have a proxy
- application in between. If you *do* intend to expose Elasticsearch directly to
- your users, then you have to decide whether you trust them enough to run scripts
- on your box or not. If you do, you can enable dynamic scripting by adding the
- following setting to the `config/elasticsearch.yml` file on every node:
- [source,yaml]
- -----------------------------------
- script.disable_dynamic: false
- -----------------------------------
- While this still allows execution of named scripts provided in the config, or
- _native_ Java scripts registered through plugins, it also allows users to run
- arbitrary scripts via the API. Rather than sending the name of the file as the
- script, users can send the body of the script itself.
- The `script.disable_dynamic` setting accepts three values. The default
- is `sandbox`:
- [cols="<,<",options="header",]
- |=======================================================================
- |Value |Description
- | `true` |All dynamic scripting is disabled; scripts must be placed in the `config/scripts` directory.
- | `false` |All dynamic scripting is enabled; scripts may be sent as strings in requests.
- | `sandbox` |Scripts may be sent as strings only for languages that are sandboxed.
- |=======================================================================
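The effect of the three values can be summarised as a small decision function. The Java sketch below only restates the table; it is not Elasticsearch's implementation. The class `DynamicScriptPolicy`, the enum constants, and both method names are hypothetical, and whether a given language counts as sandboxed is passed in by the caller rather than looked up.

```java
final class DynamicScriptPolicy {
    // Hypothetical names for the three values of script.disable_dynamic.
    enum Mode { ALL_DISABLED, ALL_ENABLED, SANDBOXED_ONLY }

    // Maps the setting's string value to a mode.
    static Mode parse(String settingValue) {
        switch (settingValue) {
            case "true":    return Mode.ALL_DISABLED;
            case "false":   return Mode.ALL_ENABLED;
            case "sandbox": return Mode.SANDBOXED_ONLY;
            default:
                throw new IllegalArgumentException("unknown value: " + settingValue);
        }
    }

    // Returns true if a script sent as a string in a request may run.
    static boolean allowsInlineScript(Mode mode, boolean languageIsSandboxed) {
        switch (mode) {
            case ALL_DISABLED: return false;
            case ALL_ENABLED:  return true;
            default:           return languageIsSandboxed;
        }
    }

    public static void main(String[] args) {
        Mode mode = parse("sandbox");
        System.out.println(allowsInlineScript(mode, true));  // true
        System.out.println(allowsInlineScript(mode, false)); // false
    }
}
```

Scripts stored in `config/scripts` are outside the scope of this check; as described above, they remain usable under every value.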
- [float]
- === Default Scripting Language
- The default scripting language (used when no `lang` parameter is provided) is
- `groovy`. To change it, set `script.default_lang` to the
- appropriate language.
- [float]
- === Groovy Sandboxing
- Elasticsearch sandboxes Groovy scripts that are compiled and executed in order
- to ensure they don't perform unwanted actions. There are a number of options
- that can be used for configuring this sandbox:
- `script.groovy.sandbox.receiver_whitelist`::
- Comma-separated list of class names of objects on which methods may be
- invoked.
- `script.groovy.sandbox.package_whitelist`::
- Comma-separated list of packages under which new objects may be constructed.
- `script.groovy.sandbox.class_whitelist`::
- Comma-separated list of classes that are allowed to be constructed.
- `script.groovy.sandbox.method_blacklist`::
- Comma-separated list of methods that are never allowed to be invoked,
- regardless of target object.
- `script.groovy.sandbox.enabled`::
- Flag to enable or disable the sandbox (defaults to `true`, meaning the sandbox is
- enabled).
- When specifying whitelist or blacklist settings for the Groovy sandbox, the
- specified values replace the current list; they are not added to it.
-
- [float]
- === Automatic Script Reloading
- The `config/scripts` directory is scanned periodically for changes.
- New and changed scripts are reloaded, and deleted scripts are removed
- from the preloaded scripts cache. The reload frequency can be specified
- using the `watcher.interval` setting, which defaults to `60s`.
- To disable script reloading completely, set `script.auto_reload_enabled`
- to `false`.
- [[native-java-scripts]]
- [float]
- === Native (Java) Scripts
- Even though `groovy` is fast, you can register native Java-based
- scripts for faster execution.
- To provide a native script, implement a `NativeScriptFactory`
- that constructs the script to be executed. There are
- two main types of script: one that extends `AbstractExecutableScript` and one that
- extends `AbstractSearchScript` (probably the one most users will extend,
- with additional helper classes in `AbstractLongSearchScript`,
- `AbstractDoubleSearchScript`, and `AbstractFloatSearchScript`).
- Scripts can be registered in one of two ways. The first is through settings: for example,
- setting `script.native.my.type` to `sample.MyNativeScriptFactory`
- registers a script named `my`. The second is in a plugin: access
- `ScriptModule` and call `registerScript` on it.
- To execute the script, specify `native` as the `lang` and
- the name of the script as the `script`.
- Note that the scripts need to be on the classpath of Elasticsearch. One
- simple way to do this is to create a directory under `plugins` (choose a
- descriptive name) and place the jar or class files there. They will be
- loaded automatically.
- [float]
- === Lucene Expressions Scripts
- [WARNING]
- ========================
- This feature is *experimental* and subject to change in future versions.
- ========================
- Lucene's expressions module provides a mechanism to compile a
- `javascript` expression to bytecode. This allows very fast execution,
- as if you had written a `native` script. Expression scripts can be
- used in `script_score`, `script_fields`, sort scripts and numeric aggregation scripts.
- See the link:http://lucene.apache.org/core/4_9_0/expressions/index.html?org/apache/lucene/expressions/js/package-summary.html[expressions module documentation]
- for details on what operators and functions are available.
- The following variables are available in `expression` scripts:
- * Single valued document fields, e.g. `doc['myfield'].value`
- * Parameters passed into the script, e.g. `mymodifier`
- * The current document's score, `_score` (only available when used in a `script_score`)
- There are a few limitations relative to other script languages:
- * Only numeric fields may be accessed
- * Stored fields are not available
- * If a field is sparse (only some documents contain a value), documents missing the field will have a value of `0`
- [float]
- === Score
- In all scripts that can be used in aggregations, the current
- document's score is accessible in `doc.score`. When using a `script_score`,
- the current score is available in `_score`.
- [float]
- === Computing scores based on terms in scripts
- See the <<modules-advanced-scripting, advanced scripting documentation>>.
- [float]
- === Document Fields
- Most scripting revolves around the use of specific document field data.
- `doc['field_name']` can be used to access specific field data within
- a document (the document in question is usually determined by the context
- in which the script is used). Document fields are very fast to access because
- all the relevant field values/tokens are loaded into memory.
- The following data can be extracted from a field:
- [cols="<,<",options="header",]
- |=======================================================================
- |Expression |Description
- |`doc['field_name'].value` |The native value of the field. For example,
- if it's a short type, it will be `short`.
- |`doc['field_name'].values` |The native array values of the field. For
- example, if it's a short type, it will be `short[]`. Remember, a field can
- have several values within a single doc. Returns an empty array if the
- field has no values.
- |`doc['field_name'].empty` |A boolean indicating if the field has no
- values within the doc.
- |`doc['field_name'].multiValued` |A boolean indicating that the field
- has several values within the corpus.
- |`doc['field_name'].lat` |The latitude of a geo point type.
- |`doc['field_name'].lon` |The longitude of a geo point type.
- |`doc['field_name'].lats` |The latitudes of a geo point type.
- |`doc['field_name'].lons` |The longitudes of a geo point type.
- |`doc['field_name'].distance(lat, lon)` |The `plane` distance (in meters)
- of this geo point field from the provided lat/lon.
- |`doc['field_name'].distanceWithDefault(lat, lon, default)` |The `plane` distance (in meters)
- of this geo point field from the provided lat/lon with a default value.
- |`doc['field_name'].distanceInMiles(lat, lon)` |The `plane` distance (in
- miles) of this geo point field from the provided lat/lon.
- |`doc['field_name'].distanceInMilesWithDefault(lat, lon, default)` |The `plane` distance (in
- miles) of this geo point field from the provided lat/lon with a default value.
- |`doc['field_name'].distanceInKm(lat, lon)` |The `plane` distance (in
- km) of this geo point field from the provided lat/lon.
- |`doc['field_name'].distanceInKmWithDefault(lat, lon, default)` |The `plane` distance (in
- km) of this geo point field from the provided lat/lon with a default value.
- |`doc['field_name'].arcDistance(lat, lon)` |The `arc` distance (in
- meters) of this geo point field from the provided lat/lon.
- |`doc['field_name'].arcDistanceWithDefault(lat, lon, default)` |The `arc` distance (in
- meters) of this geo point field from the provided lat/lon with a default value.
- |`doc['field_name'].arcDistanceInMiles(lat, lon)` |The `arc` distance (in
- miles) of this geo point field from the provided lat/lon.
- |`doc['field_name'].arcDistanceInMilesWithDefault(lat, lon, default)` |The `arc` distance (in
- miles) of this geo point field from the provided lat/lon with a default value.
- |`doc['field_name'].arcDistanceInKm(lat, lon)` |The `arc` distance (in
- km) of this geo point field from the provided lat/lon.
- |`doc['field_name'].arcDistanceInKmWithDefault(lat, lon, default)` |The `arc` distance (in
- km) of this geo point field from the provided lat/lon with a default value.
- |`doc['field_name'].factorDistance(lat, lon)` |The distance factor of this geo point field from the provided lat/lon.
- |`doc['field_name'].factorDistance(lat, lon, default)` |The distance factor of this geo point field from the provided lat/lon with a default value.
- |`doc['field_name'].geohashDistance(geohash)` |The `arc` distance (in meters)
- of this geo point field from the provided geohash.
- |`doc['field_name'].geohashDistanceInKm(geohash)` |The `arc` distance (in km)
- of this geo point field from the provided geohash.
- |`doc['field_name'].geohashDistanceInMiles(geohash)` |The `arc` distance (in
- miles) of this geo point field from the provided geohash.
- |=======================================================================
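The table distinguishes `plane` and `arc` distances without defining them. As a rough intuition, a plane distance treats the coordinates as if they lay on a flat surface, while an arc distance follows the curvature of the Earth. The Java sketch below uses two textbook formulas (an equirectangular approximation and the haversine formula) to illustrate the difference. It is an assumption-laden illustration, not Elasticsearch's implementation: the class and method names are hypothetical, and the Earth radius constant is an assumed mean value.

```java
final class GeoDistanceSketch {
    // Assumed mean Earth radius in meters; Elasticsearch may use another value.
    static final double EARTH_RADIUS_METERS = 6371008.8;

    // Arc (great-circle) distance in meters, using the haversine formula.
    static double arcDistance(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.pow(Math.sin(dLon / 2), 2);
        return 2 * EARTH_RADIUS_METERS * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    // Flat-surface distance in meters, using an equirectangular approximation.
    static double planeDistance(double lat1, double lon1, double lat2, double lon2) {
        double meanLat = Math.toRadians((lat1 + lat2) / 2);
        double x = Math.toRadians(lon2 - lon1) * Math.cos(meanLat);
        double y = Math.toRadians(lat2 - lat1);
        return EARTH_RADIUS_METERS * Math.hypot(x, y);
    }

    public static void main(String[] args) {
        // Two nearby points: both formulas give almost the same result.
        System.out.println(arcDistance(52.0, 4.0, 52.1, 4.1));
        System.out.println(planeDistance(52.0, 4.0, 52.1, 4.1));
    }
}
```

For points that are close together the two results are nearly identical, and the flat approximation is cheaper to compute. Over long distances the flat approximation drifts away from the great-circle result.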
- [float]
- === Stored Fields
- Stored fields can also be accessed when executing a script. Note that they
- are much slower to access than document fields, as they are not
- loaded into memory. They can be accessed using
- `_fields['my_field_name'].value` or `_fields['my_field_name'].values`.
- [float]
- === Source Field
- The source field can also be accessed when executing a script. The
- source field is loaded per doc, parsed, and then provided to the script
- for evaluation. The `_source` forms the context under which the source
- field can be accessed, for example `_source.obj2.obj1.field3`.
- Accessing `_source` is much slower than using `_doc`,
- but the data is not loaded into memory. For a single field access, `_fields` may be
- faster than `_source` because of the extra overhead of potentially parsing large documents.
- However, `_source` may be faster if you access multiple fields or if the source has already been
- loaded for other purposes.
- [float]
- === Groovy Built In Functions
- There are several built in functions that can be used within scripts.
- They include:
- [cols="<,<",options="header",]
- |=======================================================================
- |Function |Description
- |`sin(a)` |Returns the trigonometric sine of an angle.
- |`cos(a)` |Returns the trigonometric cosine of an angle.
- |`tan(a)` |Returns the trigonometric tangent of an angle.
- |`asin(a)` |Returns the arc sine of a value.
- |`acos(a)` |Returns the arc cosine of a value.
- |`atan(a)` |Returns the arc tangent of a value.
- |`toRadians(angdeg)` |Converts an angle measured in degrees to an
- approximately equivalent angle measured in radians.
- |`toDegrees(angrad)` |Converts an angle measured in radians to an
- approximately equivalent angle measured in degrees.
- |`exp(a)` |Returns Euler's number _e_ raised to the power of a value.
- |`log(a)` |Returns the natural logarithm (base _e_) of a value.
- |`log10(a)` |Returns the base 10 logarithm of a value.
- |`sqrt(a)` |Returns the correctly rounded positive square root of a
- value.
- |`cbrt(a)` |Returns the cube root of a double value.
- |`IEEEremainder(f1, f2)` |Computes the remainder operation on two
- arguments as prescribed by the IEEE 754 standard.
- |`ceil(a)` |Returns the smallest (closest to negative infinity) value
- that is greater than or equal to the argument and is equal to a
- mathematical integer.
- |`floor(a)` |Returns the largest (closest to positive infinity) value
- that is less than or equal to the argument and is equal to a
- mathematical integer.
- |`rint(a)` |Returns the value that is closest in value to the argument
- and is equal to a mathematical integer.
- |`atan2(y, x)` |Returns the angle _theta_ from the conversion of
- rectangular coordinates (_x_, _y_) to polar coordinates (_r_, _theta_).
- |`pow(a, b)` |Returns the value of the first argument raised to the
- power of the second argument.
- |`round(a)` |Returns the closest _int_ to the argument.
- |`random()` |Returns a random _double_ value.
- |`abs(a)` |Returns the absolute value of a value.
- |`max(a, b)` |Returns the greater of two values.
- |`min(a, b)` |Returns the smaller of two values.
- |`ulp(d)` |Returns the size of an ulp of the argument.
- |`signum(d)` |Returns the signum function of the argument.
- |`sinh(x)` |Returns the hyperbolic sine of a value.
- |`cosh(x)` |Returns the hyperbolic cosine of a value.
- |`tanh(x)` |Returns the hyperbolic tangent of a value.
- |`hypot(x, y)` |Returns sqrt(_x_^2^ + _y_^2^) without intermediate overflow
- or underflow.
- |=======================================================================
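The function names and descriptions in this table match the static methods of `java.lang.Math`. Assuming that correspondence holds, the rounding behaviour of a few of them can be checked in plain Java. The class `MathFunctionsDemo` and the method `calculateScore` are hypothetical; `calculateScore` simply restates the `log(_score * 2) + my_modifier` example used earlier.

```java
final class MathFunctionsDemo {
    // Java equivalent of the example script "log(_score * 2) + my_modifier".
    static double calculateScore(double score, double myModifier) {
        return Math.log(score * 2) + myModifier;
    }

    public static void main(String[] args) {
        // ceil and floor return a double that is a mathematical integer.
        System.out.println(Math.ceil(-1.5));   // -1.0
        System.out.println(Math.floor(-1.5));  // -2.0
        // rint rounds ties to the even neighbour; round rounds ties upwards.
        System.out.println(Math.rint(2.5));    // 2.0
        System.out.println(Math.round(2.5));   // 3
        System.out.println(Math.round(-2.5));  // -2
        // log is the natural logarithm, so log(1) is 0 and only the modifier remains.
        System.out.println(calculateScore(0.5, 8)); // 8.0
    }
}
```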
- [float]
- === Arithmetic precision in MVEL
- When dividing two numbers in MVEL-based scripts, the engine
- follows the default behaviour of Java. If you
- divide two integers (you might have configured the fields as integer in
- the mapping), the result is also an integer. For example, if a
- script calculates `1/num` and `num` is an
- integer with the value `8`, the result is `0`, not the `0.125` you
- might expect. To get the expected result, force floating-point
- arithmetic by explicitly using a double, as in `1.0/num`.
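Because the behaviour described here is Java's own, it can be reproduced in plain Java without MVEL. The class and method names below are hypothetical and exist only for this illustration.

```java
final class DivisionPrecision {
    // Both operands are ints, so Java performs integer division.
    static int integerInverse(int num) {
        return 1 / num;
    }

    // The literal 1.0 is a double, so num is promoted and the fraction is kept.
    static double doubleInverse(int num) {
        return 1.0 / num;
    }

    public static void main(String[] args) {
        System.out.println(integerInverse(8)); // 0
        System.out.println(doubleInverse(8));  // 0.125
    }
}
```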