Browse Source

[DOCS] Add runtime fields contexts to Painless execute API docs (#72131)

* add runtime fields contexts to execute docs

* Changes for formatting throughout

* Add missing context and context_setup

* Updating runtime field context

* Moving parameters and adopting a more standard API layout

* Update several examples

* Update more examples for runtime context

* Fix links

* Add boolean_field example and remove extraneous headings

* Add example for date_time context

* Remove extra space in TEST

* Updating date_time example

* Incorporating review feedback

* Adding cross links

* Tweaking some language based on feedback

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Jack Conradson 4 years ago
parent
commit
5c6dae6125

+ 16 - 12
docs/painless/painless-contexts/painless-runtime-fields-context.asciidoc

@@ -9,12 +9,17 @@ about how to use runtime fields.
 *Methods*
 
 --
-`emit` (Required)::
+[[runtime-emit-method]]
+// tag::runtime-field-emit[]
+`emit`:: (Required)
         Accepts the values from the script valuation. Scripts can call the
         `emit` method multiple times to emit multiple values.
 +
-IMPORTANT: The `emit` method cannot accept `null` values. Do not call this method if
-the referenced fields do not have any values.
+The `emit` method applies only to scripts used in a 
+<<painless-execute-runtime-context,runtime fields context>>.
++
+IMPORTANT: The `emit` method cannot accept `null` values. Do not call this
+method if the referenced fields do not have any values.
 +
 .Signatures of `emit`
 [%collapsible%open]
@@ -22,16 +27,15 @@ the referenced fields do not have any values.
 The signature for `emit` depends on the `type` of the field.
 
 [horizontal]
-[[emit-boolean]]   `boolean`::   `emit(boolean)`
-[[emit-date]]      `date`::      `emit(long)`
-[[emit-double]]    `double`::    `emit(double)`
-[[emit-geo-point]] `geo_point`:: `emit(double lat, double long)`
-[[emit-ip]]        `ip`::        `emit(String)`
-[[emit-long]]      `long`::      `emit(long)`
-[[emit-keyword]]   `keyword`::   `emit(String)`
-
+`boolean`::   `emit(boolean)`
+`date`::      `emit(long)`
+`double`::    `emit(double)`
+`geo_point`:: `emit(double lat, double long)`
+`ip`::        `emit(String)`
+`long`::      `emit(long)`
+`keyword`::   `emit(String)`
 ====
-
+// end::runtime-field-emit[]
 --
 
 --

+ 0 - 1
docs/painless/painless-guide/index.asciidoc

@@ -12,4 +12,3 @@ include::painless-execute-script.asciidoc[]
 
 include::painless-ingest.asciidoc[]
 
-include::../redirects.asciidoc[]

+ 728 - 69
docs/painless/painless-guide/painless-execute-script.asciidoc

@@ -1,37 +1,143 @@
 [[painless-execute-api]]
 === Painless execute API
-
-experimental[The painless execute api is new and the request / response format may change in a breaking way in the future]
-
-The Painless execute API allows an arbitrary script to be executed and a result to be returned.
-
-[[painless-execute-api-parameters]]
-.Parameters
-[options="header"]
-|======
-| Name              | Required  | Default                | Description
-| `script`          | yes       | -                      | The script to execute.
-| `context`         | no        | `painless_test`        | The context the script should be executed in.
-| `context_setup`   | no        | -                      | Additional parameters to the context.
-|======
-
-==== Contexts
-
-Contexts control how scripts are executed, what variables are available at runtime and what the return type is.
-
-===== Painless test context
-
-The `painless_test` context executes scripts as is and does not add any special parameters.
-The only variable that is available is `params`, which can be used to access user defined values.
-The result of the script is always converted to a string.
-If no context is specified then this context is used by default.
-
-*Example*
-
-Request:
+experimental::[]
+
+The Painless execute API runs a script and returns a result.
+
+[[painless-execute-api-request]]
+==== {api-request-title}
+`POST /_scripts/painless/_execute`
+
+[[painless-execute-api-desc]]
+==== {api-description-title}
+Use this API to build and test scripts, such as when defining a script for a
+{ref}/runtime.html[runtime field]. This API requires very few dependencies, and is
+especially useful if you don't have permissions to write documents on a cluster.
+
+The API uses several _contexts_, which control how scripts are executed, what
+variables are available at runtime, and what the return type is.
+
+Each context requires a script, but additional parameters depend on the context
+you're using for that script.
+
+[[painless-execute-api-request-body]]
+==== {api-request-body-title}
+`script`:: (Required, object)
+The Painless script to execute.
++
+.Properties of `script`
+--
+include::../painless-contexts/painless-runtime-fields-context.asciidoc[tag=runtime-field-emit]
+--
+
+`context`:: (Optional, string)
+The context that the script should run in. Defaults to `painless_test` if no
+context is specified.
++
+.Properties of `context`
+[%collapsible%open]
+====
+`painless_test`::
+The default context if no other context is specified. See 
+<<painless-execute-test,test context>>.
+
+`filter`::
+Treats scripts as if they were run inside a `script` query. See
+<<painless-execute-filter-context,filter context>>.
+
+`score`::
+Treats scripts as if they were run inside a `script_score` function in a
+`function_score` query. See <<painless-execute-core-context,score context>>.
+
+[[painless-execute-runtime-context]]
+.Field contexts
+[%collapsible%open]
+=====
+--
+The following options are specific to the field contexts.
+
+NOTE: Result ordering in the field contexts is not guaranteed.
+--
+
+****
+`boolean_field`:: 
+The context for {ref}/boolean.html[`boolean` fields]. The script returns a `true`
+or `false` response. See
+<<painless-runtime-boolean,boolean_field context>>.
+
+`date_field`:: 
+The context for {ref}/date.html[`date` fields]. `emit` takes a `long` value and
+the script returns a sorted list of dates. See
+<<painless-runtime-datetime,date_time context>>.
+
+`double_field`::
+The context for `double` {ref}/number.html[numeric fields]. The script returns a
+sorted list of `double` values. See
+<<painless-runtime-double,double_field context>>.
+
+`geo_point_field`::
+The context for {ref}/geo-point.html[`geo-point` fields]. `emit` takes a
+`geo-point` value and the script returns coordinates for the geo point. See
+<<painless-runtime-geo,geo_point_field context>>.
+
+`ip_field`::
+The context for {ref}/ip.html[`ip` fields]. The script returns a sorted list of IP
+addresses. See
+<<painless-runtime-ip,ip_field context>>.
+
+`keyword_field`::
+The context for {ref}/keyword.html[`keyword` fields]. The script returns a sorted
+list of `string` values. See
+<<painless-runtime-keyword,keyword_field context>>.
+
+`long_field`::
+The context for `long` {ref}/number.html[numeric fields]. The script returns a
+sorted list of `long` values. See <<painless-runtime-long,long_field context>>.
+****
+=====
+====
+
+`context_setup`:: (Required, object)
+Additional parameters for the `context`.
++
+NOTE: This parameter is required for all contexts except `painless_test`,
+which is the default if no value is provided for `context`.
++
+.Properties of `context_setup`
+[%collapsible%open]
+====
+`document`:: (Required, string)
+Document that's temporarily indexed in-memory and accessible from the script.
+
+`index`:: (Required, string)
+Index containing a mapping that's compatible with the indexed document.
+====
+
+`params`:: (`Map`, read-only)
+Specifies any named parameters that are passed into the script as variables.
+
+`query`:: (Optional, object)
+NOTE: This parameter only applies when `score` is specified as the script 
+`context`.
++
+Use this parameter to specify a query for computing a score. Besides deciding
+whether or not the document matches, the 
+{ref}/query-filter-context.html#query-context[query clause] also calculates a 
+relevance score in the `_score` metadata field.
+
+[[painless-execute-test]]
+==== Test context
+The `painless_test` context runs scripts without additional parameters. The only
+variable that is available is `params`, which can be used to access user defined
+values. The result of the script is always converted to a string.
+
+Because the default context is `painless_test`, you don't need to specify the
+`context` or `context_setup`.
+
+===== Request
 
 [source,console]
-----------------------------------------------------------------
+----
 POST /_scripts/painless/_execute
 {
   "script": {
@@ -42,33 +148,29 @@ POST /_scripts/painless/_execute
     }
   }
 }
-----------------------------------------------------------------
+----
 
-Response:
+===== Response
 
 [source,console-result]
---------------------------------------------------
+----
 {
   "result": "0.1"
 }
---------------------------------------------------
+----
 
-===== Filter context
+[[painless-execute-filter-context]]
+==== Filter context
+The `filter` context treats scripts as if they were run inside a `script` query.
+For testing purposes, a document must be provided so that it will be temporarily
+indexed in-memory and is accessible from the script. More precisely, the
+`_source`, stored fields and doc values of such a document are available to the
+script being tested.
 
-The `filter` context executes scripts as if they were executed inside a `script` query.
-For testing purposes, a document must be provided so that it will be temporarily indexed in-memory and
-is accessible from the script. More precisely, the _source, stored fields and doc values of such a 
-document are available to the script being tested.
-
-The following parameters may be specified in `context_setup` for a filter context:
-
-document:: Contains the document that will be temporarily indexed in-memory and is accessible from the script.
-index:: The name of an index containing a mapping that is compatible with the document being indexed.
-
-*Example*
+===== Request
 
 [source,console]
-----------------------------------------------------------------
+----
 PUT /my-index-000001
 {
   "mappings": {
@@ -79,7 +181,10 @@ PUT /my-index-000001
     }
   }
 }
+----
 
+[source,console]
+----
 POST /_scripts/painless/_execute
 {
   "script": {
@@ -96,33 +201,27 @@ POST /_scripts/painless/_execute
     }
   }
 }
-----------------------------------------------------------------
+----
+// TEST[continued]
 
-Response:
+===== Response
 
 [source,console-result]
---------------------------------------------------
+----
 {
   "result": true
 }
---------------------------------------------------
-
-
-===== Score context
-
-The `score` context executes scripts as if they were executed inside a `script_score` function in
-`function_score` query.
-
-The following parameters may be specified in `context_setup` for a score context:
+----
 
-document:: Contains the document that will be temporarily indexed in-memory and is accessible from the script.
-index:: The name of an index containing a mapping that is compatible with the document being indexed.
-query:: If `_score` is used in the script then a query can specify that it will be used to compute a score.
+[[painless-execute-core-context]]
+==== Score context
+The `score` context treats scripts as if they were run inside a `script_score`
+function in a `function_score` query.
 
-*Example*
+===== Request
 
 [source,console]
-----------------------------------------------------------------
+----
 PUT /my-index-000001
 {
   "mappings": {
@@ -136,8 +235,10 @@ PUT /my-index-000001
     }
   }
 }
+----
 
-
+[source,console]
+----
 POST /_scripts/painless/_execute
 {
   "script": {
@@ -154,13 +255,571 @@ POST /_scripts/painless/_execute
     }
   }
 }
-----------------------------------------------------------------
+----
+// TEST[continued]
 
-Response:
+===== Response
 
 [source,console-result]
---------------------------------------------------
+----
 {
   "result": 0.8
 }
---------------------------------------------------
+----
+
+[[painless-execute-runtime-field-context]]
+==== Field contexts
+The field contexts treat scripts as if they were run inside the
+{ref}/runtime-search-request.html[`runtime_mappings` section] of a search query.
+You can use field contexts to test scripts for different field types, and then
+include those scripts anywhere that they're supported, such as  <<painless-runtime-fields,runtime fields>>.
+
+Choose a field context based on the data type you want to return.
+
+[[painless-runtime-boolean]]
+===== `boolean_field`
+Use the `boolean_field` field context when you want to return a `true`
+or `false` value from a script valuation. {ref}/boolean.html[Boolean fields]
+accept `true` and `false` values, but can also accept strings that are
+interpreted as either true or false. 
+
+Let's say you have data for the top 100 science fiction books of all time. You
+want to write scripts that return a boolean response such as whether books
+exceed a certain page count, or if a book was published after a specific year.
+
+Consider that your data is structured like this:
+
+[source,console]
+----
+PUT /my-index-000001
+{
+  "mappings": {
+    "properties": {
+      "name": {
+        "type": "keyword"
+      },
+      "author": {
+        "type": "keyword"
+      },
+      "release_date": {
+        "type": "date"
+      },
+      "page_count": {
+        "type": "double"
+      }
+    }
+  }
+}
+----
+
+You can then write a script in the `boolean_field` context that indicates
+whether a book was published before the year 1972:
+
+[source,console]
+----
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      emit(doc['release_date'].value.year < 1972);
+    """
+  },
+  "context": "boolean_field",
+  "context_setup": {
+    "index": "my-index-000001",
+    "document": {
+      "name": "Dune",
+      "author": "Frank Herbert",
+      "release_date": "1965-06-01",
+      "page_count": 604
+    }
+  }
+}
+----
+// TEST[continued]
+
+Because _Dune_ was published in 1965, the result returns as `true`:
+
+[source,console-result]
+----
+{
+  "result" : [
+    true
+  ]
+}
+----
+
+Similarly, you could write a script that determines whether the first name of
+an author exceeds a certain number of characters. The following script operates
+on the `author` field to determine whether the author's first name contains at
+least one character, but is less than five characters:
+
+[source,console]
+----
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      int space = doc['author'].value.indexOf(' ');
+      emit(space > 0 && space < 5);
+    """
+  },
+  "context": "boolean_field",
+  "context_setup": {
+    "index": "my-index-000001",
+    "document": {
+      "name": "Dune",
+      "author": "Frank Herbert",
+      "release_date": "1965-06-01",
+      "page_count": 604
+    }
+  }
+}
+----
+// TEST[continued]
+
+Because `Frank` is five characters, the response returns `false` for the script
+valuation:
+
+[source,console-result]
+----
+{
+  "result" : [
+    false
+  ]
+}
+----
+
+[[painless-runtime-datetime]]
+===== `date_time`
+Several options are available for using
+<<painless-datetime,using datetime in Painless>>. In this example, you'll
+estimate when a particular author starting writing a book based on its release
+date and the writing speed of that author. The example makes some assumptions,
+but shows to write a script that operates on a date while incorporating
+additional information.
+
+Add the following fields to your index mapping to get started:
+
+[source,console]
+----
+PUT /my-index-000001
+{
+  "mappings": {
+    "properties": {
+      "name": {
+        "type": "keyword"
+      },
+      "author": {
+        "type": "keyword"
+      },
+      "release_date": {
+        "type": "date"
+      },
+      "page_count": {
+        "type": "long"
+      }
+    }
+  }
+}
+----
+
+The following script makes the incredible assumption that when writing a book,
+authors just write each page and don't do research or revisions. Further, the
+script assumes that the average time it takes to write a page is eight hours.
+
+The script retrieves the `author` and makes another fantastic assumption to
+either divide or multiply the `pageTime` value based on the author's perceived
+writing speed (yet another wild assumption).
+
+The script subtracts the release date value (in milliseconds) from the
+calculation of `pageTime` times the `page_count` to determine approximately
+(based on numerous assumptions) when the author began writing the book.
+
+[source,console]
+----
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      String author = doc['author'].value;
+      long pageTime = 28800000;  <1>
+      if (author == 'Robert A. Heinlein') {
+        pageTime /= 2;           <2>
+      } else if (author == 'Alastair Reynolds') {
+        pageTime *= 2;           <3>
+      }
+      emit(doc['release_date'].value.toInstant().toEpochMilli() - pageTime * doc['page_count'].value);
+    """
+  },
+  "context": "date_field",
+  "context_setup": {
+    "index": "my-index-000001",
+    "document": {
+      "name": "Revelation Space",
+      "author": "Alastair Reynolds",
+      "release_date": "2000-03-15",
+      "page_count": 585
+    }
+  }
+}
+----
+//TEST[continued]
+<1> Eight hours, represented in milliseconds
+<2> Incredibly fast writing from Robert A. Heinlein
+<3> Alastair Reynolds writes space operas at a much slower speed
+
+In this case, the author is Alastair Reynolds. Based on a release date of
+`2000-03-15`, the script calculates that the author started writing
+`Revelation Space` on 19 February 1999. Writing a 585 page book in just over one
+year is pretty impressive!
+
+[source,console-result]
+----
+{
+  "result" : [
+    "1999-02-19T00:00:00.000Z"
+  ]
+}
+----
+
+[[painless-runtime-double]]
+===== `double_field`
+Use the `double_field` context for {ref}/number.html[numeric data] of type
+`double`. For example, let's say you have sensor data that includes a `voltage`
+field with values like 5.6. After indexing millions of documents, you discover
+that the sensor with model number `QVKC92Q` is under reporting its voltage by a
+factor of 1.7. Rather than reindex your data, you can fix it with a
+runtime field.
+
+You need to multiply this value, but only for
+sensors that match a specific model number.
+
+Add the following fields to your index mapping. The `voltage` field is a 
+sub-field of the `measures` object.
+
+[source,console]
+----
+PUT my-index-000001
+{
+  "mappings": {
+    "properties": {
+      "@timestamp": {
+        "type": "date"
+      },
+      "model_number": {
+        "type": "keyword"
+      },
+      "measures": {
+        "properties": {
+          "voltage": {
+            "type": "double"
+          }
+        }
+      }
+    }
+  }
+}
+----
+
+The following script matches on any documents where the `model_number` equals
+`QVKC92Q`, and then multiplies the `voltage` value by `1.7`. This script is
+useful when you want to select specific documents and only operate on values
+that match the specified criteria.
+
+[source,console]
+----
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      if (doc['model_number'].value.equals('QVKC92Q'))
+      {emit(1.7 * params._source['measures']['voltage']);}
+      else{emit(params._source['measures']['voltage']);}
+    """
+  },
+  "context": "double_field",
+  "context_setup": {
+    "index": "my-index-000001",
+    "document": {
+      "@timestamp": 1516470094000,
+      "model_number": "QVKC92Q",
+      "measures": {
+        "voltage": 5.6
+      }
+    }
+  }
+}
+----
+// TEST[continued]
+
+The result includes the calculated voltage, which was determined by multiplying
+the original value of `5.6` by `1.7`:
+
+[source,console-result]
+----
+{
+  "result" : [
+    9.52
+  ]
+}
+----
+
+[[painless-runtime-geo]]
+===== `geo_point_field`
+{ref}/geo-point.html[Geo-point] fields accept latitude-longitude pairs. You can
+define a geo-point field in several ways, and include values for latitude and
+longitude in the document for your script.
+
+If you already have a known geo-point, it's simpler to clearly state the
+positions of `lat` and `lon` in your index mappings. 
+
+[source,console]
+----
+PUT /my-index-000001/
+{
+  "mappings": {
+    "properties": {
+      "lat": {
+        "type": "double"
+      },
+     "lon": {
+        "type": "double"
+      }
+    }
+  }
+}
+----
+
+You can then use the `geo_point_field` runtime field context to write a script
+that retrieves the `lat` and `long` values.
+
+[source,console]
+----
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      emit(doc['lat'].value, doc['lon'].value);
+    """
+  },
+  "context": "geo_point_field",
+  "context_setup": {
+    "index": "my-index-000001",
+    "document": {
+      "lat": 41.12,
+      "lon": -71.34
+    }
+  }
+}
+----
+// TEST[continued]
+
+Because this you're working with a geo-point field type, the response includes
+results that are formatted as `coordinates`. 
+
+[source,console-result]
+----
+{
+  "result" : [
+    {
+      "coordinates" : [
+        -71.34000004269183,
+        41.1199999647215
+      ],
+      "type" : "Point"
+    }
+  ]
+}
+----
+
+[[painless-runtime-ip]]
+===== `ip_field`
+The `ip_field` context is useful for data that includes IP addresses of type
+{ref}/ip.html[`ip`]. For example, let's say you have a `message` field from an Apache
+log. This field contains an IP address, but also other data that you don't need. 
+
+You can add the `message` field to your index mappings as a `wildcard` to accept
+pretty much any data you want to put in that field.
+
+[source,console]
+----
+PUT /my-index-000001/
+{
+  "mappings": {
+    "properties": {
+      "message": {
+        "type": "wildcard"
+      }
+    }
+  }
+}
+----
+
+You can then define a runtime script with a grok pattern that extracts
+structured fields out of the `message` field.
+
+The script matches on the `%{COMMONAPACHELOG}` log pattern, which understands
+the structure of Apache logs. If the pattern matches, the script emits the
+value matching the IP address. If the pattern doesn’t match
+(`clientip != null`), the script just returns the field value without crashing.
+
+[source,console]
+----
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      String clientip=grok('%{COMMONAPACHELOG}').extract(doc["message"].value)?.clientip;
+      if (clientip != null) emit(clientip); 
+    """
+  },
+  "context": "ip_field",
+  "context_setup": {
+    "index": "my-index-000001",
+    "document": {
+      "message": "40.135.0.0 - - [30/Apr/2020:14:30:17 -0500] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"
+    }
+  }
+}
+----
+// TEST[continued]
+
+The response includes only the IP address, ignoring all of the other data in the
+`message` field.
+
+[source,console-result]
+----
+{
+  "result" : [
+    "40.135.0.0"
+  ]
+}
+----
+
+[[painless-runtime-keyword]]
+===== `keyword_field`
+{ref}/keyword.html[Keyword fields] are often used in sorting, aggregations, and
+term-level queries.
+
+Let's say you have a timestamp. You want to calculate the day of the week based
+on that value and return it, such as `Thursday`. The following request adds a
+`@timestamp` field of type `date` to the index mappings:
+
+[source,console]
+----
+PUT /my-index-000001
+{
+  "mappings": {
+    "properties": {
+      "@timestamp": {
+        "type": "date"
+      }
+    }
+  }
+}
+----
+
+To return the equivalent day of week based on your timestamp, you can create a
+script in the `keyword_field` runtime field context:
+
+[source,console]
+----
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT));
+    """
+  },
+  "context": "keyword_field",
+  "context_setup": {
+    "index": "my-index-000001",
+    "document": {
+      "@timestamp": "2020-04-30T14:31:43-05:00"
+    }
+  }
+}
+----
+// TEST[continued]
+
+The script operates on the value provided for the `@timestamp` field to
+calculate and return the day of the week:
+
+[source,console-result]
+----
+{
+  "result" : [
+    "Thursday"
+  ]
+}
+----
+
+[[painless-runtime-long]]
+===== `long_field`
+Let's say you have sensor data that a `measures` object. This object includes
+a `start` and `end` field, and you want to calculate the difference between
+those values.
+
+The following request adds a `measures` object to the mappings with two fields,
+both of type `long`:
+
+[source,console]
+----
+PUT /my-index-000001/
+{
+  "mappings": {
+    "properties": {
+      "measures": {
+        "properties": {
+          "start": {
+            "type": "long"
+          },
+          "end": {
+           "type": "long" 
+          }
+        }
+      }
+    }
+  }
+}
+----
+
+You can then define a script that assigns values to the `start` and `end` fields
+and operate on them. The following script extracts the value for the `end`
+field from the `measures` object and subtracts it from the `start` field:
+
+[source,console]
+----
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      emit(doc['measures.end'].value - doc['measures.start'].value);
+    """
+  },
+  "context": "long_field",
+  "context_setup": {
+    "index": "my-index-000001",
+    "document": {
+      "measures": {
+        "voltage": "4.0",
+        "start": "400",
+        "end": "8625309"
+      }
+    }
+  }
+}
+----
+// TEST[continued]
+
+The response includes the calculated value from the script valuation:
+
+[source,console-result]
+----
+{
+  "result" : [
+    8624909
+  ]
+}
+----