@@ -1,33 +1,5 @@
-[[ingest]]
-== Ingest Node
-
-Ingest node can be used to pre-process documents before the actual indexing takes place.
-This pre-processing happens by an ingest node that intercepts bulk and index requests, applies the
-transformations and then passes the documents back to the index or bulk APIs.
-
-Ingest node is enabled by default. In order to disable ingest the following
-setting should be configured in the elasticsearch.yml file:
-
-[source,yaml]
---------------------------------------------------
-node.ingest: false
---------------------------------------------------
-
-It is possible to enable ingest on any node or have dedicated ingest nodes.
-
-In order to pre-process document before indexing the `pipeline` parameter should be used
-on an index or bulk request to tell Ingest what pipeline is going to be used.
-
-[source,js]
---------------------------------------------------
-PUT /my-index/my-type/my-id?pipeline=my_pipeline_id
-{
-  ...
-}
---------------------------------------------------
-// AUTOSENSE
-
-=== Pipeline Definition
+[[pipe-line]]
+== Pipeline Definition

A pipeline is a definition of a series of processors that are to be
executed in the same sequential order as they are declared.
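For illustration, a minimal pipeline definition combining the `description` and `processors` parameters described here might look like the following sketch (the `set` processor and its field values are placeholders):

[source,js]
--------------------------------------------------
{
  "description" : "sets a field on every document",
  "processors" : [
    {
      "set" : {
        "field" : "foo",
        "value" : "bar"
      }
    }
  ]
}
--------------------------------------------------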
@@ -45,7 +17,7 @@ what the pipeline attempts to achieve.
The `processors` parameter defines a list of processors to be executed in
order.

-=== Processors
+== Processors

All processors are defined in the following way within a pipeline definition:

@@ -67,7 +39,7 @@ but is very useful for bookkeeping and tracing errors to specific processors.

See <<handling-failure-in-pipelines>> to learn more about the `on_failure` field and error handling in pipelines.

-==== Set processor
+=== Set processor
Sets one field and associates it with the specified value. If the field already exists,
its value will be replaced with the provided one.

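The options table and full example for this processor are elided from the hunk below; for reference, a minimal `set` entry looks like this sketch (field name and value are placeholders):

[source,js]
--------------------------------------------------
{
  "set": {
    "field": "field1",
    "value": 582.1
  }
}
--------------------------------------------------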
@@ -90,7 +62,7 @@ its value will be replaced with the provided one.
}
--------------------------------------------------

-==== Append processor
+=== Append processor
Appends one or more values to an existing array if the field already exists and it is an array.
Converts a scalar to an array and appends one or more values to it if the field exists and it is a scalar.
Creates an array containing the provided values if the field doesn't exist.
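As with `set`, the example is elided here; a minimal `append` entry might look like this sketch (placeholder field and values):

[source,js]
--------------------------------------------------
{
  "append": {
    "field": "tags",
    "value": ["production", "web"]
  }
}
--------------------------------------------------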
@@ -115,7 +87,7 @@ Accepts a single value or an array of values.
}
--------------------------------------------------

-==== Remove processor
+=== Remove processor
Removes an existing field. If the field doesn't exist, an exception will be thrown.

[[remove-options]]
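The options table that follows is elided from this diff; for reference, a minimal `remove` sketch (placeholder field name):

[source,js]
--------------------------------------------------
{
  "remove": {
    "field": "user_agent"
  }
}
--------------------------------------------------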
@@ -135,7 +107,7 @@ Removes an existing field. If the field doesn't exist, an exception will be thro
}
--------------------------------------------------

-==== Rename processor
+=== Rename processor
Renames an existing field. If the field doesn't exist, an exception will be thrown. Also, the new field
name must not exist.

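A sketch of a `rename` entry follows; note that the destination option name `to` is an assumption based on this era of the processor (later versions use `target_field` instead), so defer to the elided options table:

[source,js]
--------------------------------------------------
{
  "rename": {
    "field": "foo",
    "to": "foobar"
  }
}
--------------------------------------------------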
@@ -159,7 +131,7 @@ name must not exist.
--------------------------------------------------


-==== Convert processor
+=== Convert processor
Converts an existing field's value to a different type, like turning a string to an integer.
If the field value is an array, all members will be converted.

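A minimal `convert` sketch (placeholder field, converting to `integer`):

[source,js]
--------------------------------------------------
{
  "convert": {
    "field" : "field1",
    "type": "integer"
  }
}
--------------------------------------------------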
@@ -187,7 +159,7 @@ false if its string value is equal to `false` (ignore case) and it will throw ex
}
--------------------------------------------------

-==== Gsub processor
+=== Gsub processor
Converts a string field by applying a regular expression and a replacement.
If the field is not a string, the processor will throw an exception.

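A minimal `gsub` sketch that would replace every dot in the named field with a dash (placeholders):

[source,js]
--------------------------------------------------
{
  "gsub": {
    "field": "field1",
    "pattern": "\\.",
    "replacement": "-"
  }
}
--------------------------------------------------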
@@ -212,7 +184,7 @@ If the field is not a string, the processor will throw an exception.
}
--------------------------------------------------

-==== Join processor
+=== Join processor
Joins each element of an array into a single string using a separator character between each element.
Throws an error when the field is not an array.

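A minimal `join` sketch (placeholder field):

[source,js]
--------------------------------------------------
{
  "join": {
    "field": "joined_array_field",
    "separator": "-"
  }
}
--------------------------------------------------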
@@ -235,7 +207,7 @@ Throws error when the field is not an array.
}
--------------------------------------------------

-==== Split processor
+=== Split processor
Splits a field into an array using a separator character. Only works on string fields.

[[split-options]]
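The options table that follows is elided; a minimal `split` sketch (placeholder field):

[source,js]
--------------------------------------------------
{
  "split": {
    "field": "my_field",
    "separator": ","
  }
}
--------------------------------------------------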
@@ -255,7 +227,7 @@ Split a field to an array using a separator character. Only works on string fiel
}
--------------------------------------------------

-==== Lowercase processor
+=== Lowercase processor
Converts a string to its lowercase equivalent.

[[lowercase-options]]
@@ -275,7 +247,7 @@ Converts a string to its lowercase equivalent.
}
--------------------------------------------------

-==== Uppercase processor
+=== Uppercase processor
Converts a string to its uppercase equivalent.

[[uppercase-options]]
@@ -295,7 +267,7 @@ Converts a string to its uppercase equivalent.
}
--------------------------------------------------

-==== Trim processor
+=== Trim processor
Trims whitespace from a field. NOTE: this only removes leading and trailing whitespace.

[[trim-options]]
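The `lowercase`, `uppercase`, and `trim` processors all share the same minimal shape, a single `field` to operate on; one representative sketch (swap `trim` for `lowercase` or `uppercase` as needed):

[source,js]
--------------------------------------------------
{
  "trim": {
    "field": "foo"
  }
}
--------------------------------------------------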
@@ -315,7 +287,7 @@ Trims whitespace from field. NOTE: this only works on leading and trailing white
}
--------------------------------------------------

-==== Grok Processor
+=== Grok Processor

The Grok Processor extracts structured fields out of a single text field within a document. You choose which field to
extract matched fields from, as well as the Grok Pattern you expect will match. A Grok Pattern is like a regular
@@ -330,7 +302,7 @@ Here, you can add your own custom grok pattern files with custom grok expression
If you need help building patterns to match your logs, you will find the <http://grokdebug.herokuapp.com> and
<http://grokconstructor.appspot.com/> applications quite useful!

-===== Grok Basics
+==== Grok Basics

Grok sits on top of regular expressions, so any regular expressions are valid in grok as well.
The regular expression library is Oniguruma, and you can see the full supported regexp syntax
@@ -367,7 +339,7 @@ Grok expression.
%{NUMBER:duration} %{IP:client}
--------------------------------------------------

-===== Custom Patterns and Pattern Files
+==== Custom Patterns and Pattern Files

The Grok Processor comes pre-packaged with a base set of pattern files. These patterns may not always have
what you are looking for. These pattern files have a very basic format. Each line describes a named pattern with
@@ -393,7 +365,7 @@ SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
--------------------------------------------------

-===== Using Grok Processor in a Pipeline
+==== Using Grok Processor in a Pipeline

[[grok-options]]
.Grok Options
@@ -417,7 +389,7 @@ a document.

The pattern for this could be

-[source]
+[source,js]
--------------------------------------------------
%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
--------------------------------------------------
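A pipeline applying this pattern to the `message` field might look like the following sketch. The option name shown here, `pattern` (singular), is an assumption for this era of the processor; consult the elided Grok Options table, since later versions take a `patterns` array instead:

[source,js]
--------------------------------------------------
{
  "description" : "parse a single apache-style log line",
  "processors": [
    {
      "grok": {
        "field": "message",
        "pattern": "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
      }
    }
  ]
}
--------------------------------------------------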
@@ -474,7 +446,7 @@ An example of a pipeline specifying custom pattern definitions:
}
--------------------------------------------------

-==== Date processor
+=== Date processor

The date processor parses dates from fields, and then uses that date or timestamp as the timestamp for the document.
By default, the date processor adds the parsed date as a new field called `@timestamp`, configurable by setting the `target_field`
@@ -512,7 +484,7 @@ An example that adds the parsed date to the `timestamp` field based on the `init
}
--------------------------------------------------

-==== Fail processor
+=== Fail processor
The Fail Processor is used to raise an exception. This is useful when
a user expects a pipeline to fail and wishes to relay a specific message
to the requester.
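A minimal `fail` sketch:

[source,js]
--------------------------------------------------
{
  "fail": {
    "message": "an error message"
  }
}
--------------------------------------------------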
@@ -534,7 +506,7 @@ to the requester.
}
--------------------------------------------------

-==== Foreach processor
+=== Foreach processor
All processors can operate on elements inside an array, but if all elements of an array need to
be processed in the same way, defining a processor for each element becomes cumbersome and tricky
because it is likely that the number of elements in an array is unknown. For this reason the `foreach`
@@ -680,7 +652,7 @@ In this example if the `remove` processor does fail then
the array elements that have been processed thus far will
be updated.

-=== Accessing data in pipelines
+== Accessing data in pipelines

Processors in pipelines have read and write access to documents that pass through the pipeline.
The fields in the source of a document and its metadata fields are accessible.
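For example, a `set` processor can read other fields in the source through template snippets (assuming the usual mustache-style `{{ }}` syntax) and write a combined value back to the document; the field names below are illustrative:

[source,js]
--------------------------------------------------
{
  "set": {
    "field": "field_c",
    "value": "{{field_a}} {{field_b}}"
  }
}
--------------------------------------------------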
@@ -781,7 +753,8 @@ to depends on the field in the source with name `geoip.country_iso_code`.
}
--------------------------------------------------

-==== Handling Failure in Pipelines
+[[handling-failure-in-pipelines]]
+=== Handling Failure in Pipelines

In its simplest case, pipelines describe a list of processors which
are executed sequentially and processing halts at the first exception. This
@@ -845,7 +818,7 @@ the index for which failed documents get sent.
--------------------------------------------------


-===== Accessing Error Metadata From Processors Handling Exceptions
+==== Accessing Error Metadata From Processors Handling Exceptions

Sometimes you may want to retrieve the actual error message that was thrown
by a failed processor. To do so you can access metadata fields called
@@ -878,9 +851,9 @@ of manually setting it.
--------------------------------------------------


-=== Ingest APIs
+== Ingest APIs

-==== Put pipeline API
+=== Put pipeline API

The put pipeline api adds pipelines and updates existing pipelines in the cluster.
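The request body is elided between these hunks; a put pipeline request has roughly this shape (the pipeline id and `set` processor are placeholders):

[source,js]
--------------------------------------------------
PUT _ingest/pipeline/my-pipeline-id
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "set" : {
        "field": "foo",
        "value": "bar"
      }
    }
  ]
}
--------------------------------------------------
// AUTOSENSE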
@@ -904,7 +877,7 @@ PUT _ingest/pipeline/my-pipeline-id
NOTE: The put pipeline api also instructs all ingest nodes to reload their in-memory representation of pipelines, so that
pipeline changes take effect immediately.

-==== Get pipeline API
+=== Get pipeline API

The get pipeline api returns pipelines based on id. This api always returns a local reference of the pipeline.

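For reference, a minimal get request (sketch):

[source,js]
--------------------------------------------------
GET _ingest/pipeline/my-pipeline-id
--------------------------------------------------
// AUTOSENSE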
@@ -940,7 +913,7 @@ For each returned pipeline the source and the version is returned.
The version is useful for knowing what version of the pipeline the node has.
Multiple ids can be provided at the same time. Wildcards are also supported.

-==== Delete pipeline API
+=== Delete pipeline API

The delete pipeline api deletes pipelines by id.

@@ -950,7 +923,7 @@ DELETE _ingest/pipeline/my-pipeline-id
--------------------------------------------------
// AUTOSENSE

-==== Simulate pipeline API
+=== Simulate pipeline API

The simulate pipeline api executes a specific pipeline against
the set of documents provided in the body of the request.
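A simulate request wraps an inline pipeline definition together with the sample documents to run it against; a sketch (the document metadata fields shown are illustrative):

[source,js]
--------------------------------------------------
POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "description" : "_description",
    "processors" : [
      {
        "set" : {
          "field" : "field2",
          "value" : "_value"
        }
      }
    ]
  },
  "docs" : [
    {
      "_index" : "index",
      "_type" : "type",
      "_id" : "id",
      "_source" : {
        "foo" : "bar"
      }
    }
  ]
}
--------------------------------------------------
// AUTOSENSE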