lqb
/
elasticsearch
mirror of https://gitee.com/mirrors/elasticsearch.git


			
							123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161
							[[script-processor]]
=== Script processor
++++
<titleabbrev>Script</titleabbrev>
++++

Runs an inline or stored <<modules-scripting,script>> on incoming documents. The
script runs in the {painless}/painless-ingest-processor-context.html[`ingest`]
context.

The script processor uses the <<scripts-and-search-speed,script cache>> to avoid
recompiling the script for each incoming document. To improve performance,
ensure the script cache is properly sized before using a script processor in
production.

[[script-options]]
.Script options
[options="header"]
|======
| Name        | Required  | Default    | Description
| `lang`      | no        | "painless" | <<scripting-available-languages,Script language>>.
| `id`        | no        | -          | ID of a <<create-stored-script-api,stored script>>.
                                         If no `source` is specified, this parameter is required.
| `source`    | no        | -          | Inline script.
                                         If no `id` is specified, this parameter is required.
| `params`    | no        | -          | Object containing parameters for the script.
include::common-options.asciidoc[]
|======

[discrete]
[[script-processor-access-source-fields]]
==== Access source fields

The script processor parses each incoming document's JSON source fields into a
set of maps, lists, and primitives. To access these fields with a Painless
script, use the
{painless}/painless-operators-reference.html#map-access-operator[map access
operator]: `ctx['my-field']`. You can also use the shorthand `ctx.<my-field>`
syntax.

NOTE: The script processor does not support the `ctx['_source']['my-field']` or
`ctx._source.<my-field>` syntaxes.

The following processor uses a Painless script to extract the `tags` field from
the `env` source field.

[source,console]
----
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "description": "Extract 'tags' from 'env' field",
          "lang": "painless",
          "source": """
            String[] envSplit = ctx['env'].splitOnToken(params['delimiter']);
            ArrayList tags = new ArrayList();
            tags.add(envSplit[params['position']].trim());
            ctx['tags'] = tags;
          """,
          "params": {
            "delimiter": "-",
            "position": 1
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "env": "es01-prod"
      }
    }
  ]
}
----

The processor produces:

[source,console-result]
----
{
  "docs": [
    {
      "doc": {
        ...
        "_source": {
          "env": "es01-prod",
          "tags": [
            "prod"
          ]
        }
      }
    }
  ]
}
----
// TESTRESPONSE[s/\.\.\./"_index":"_index","_id":"_id","_ingest":{"timestamp":$body.docs.0.doc._ingest.timestamp},/]


[discrete]
[[script-processor-access-metadata-fields]]
==== Access metadata fields

You can also use a script processor to access metadata fields. The following
processor uses a Painless script to set an incoming document's `_index`.

[source,console]
----
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "description": "Set index based on `lang` field and `dataset` param",
          "lang": "painless",
          "source": """
            ctx['_index'] = ctx['lang'] + '-' + params['dataset'];
          """,
          "params": {
            "dataset": "catalog"
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "generic-index",
      "_source": {
        "lang": "fr"
      }
    }
  ]
}
----

The processor changes the document's `_index` to `fr-catalog` from
`generic-index`.

[source,console-result]
----
{
  "docs": [
    {
      "doc": {
        ...
        "_index": "fr-catalog",
        "_source": {
          "lang": "fr"
        }
      }
    }
  ]
}
----
// TESTRESPONSE[s/\.\.\./"_id":"_id","_ingest":{"timestamp":$body.docs.0.doc._ingest.timestamp},/]