|
@@ -4,7 +4,15 @@
|
|
|
<titleabbrev>Processor reference</titleabbrev>
|
|
|
++++
|
|
|
|
|
|
-{es} includes several configurable processors. To get a list of available
|
|
|
+An <<ingest,ingest pipeline>> is made up of a sequence of processors that are applied to documents as they are ingested into an index.
|
|
|
+Each processor performs a specific task, such as filtering, transforming, or enriching data.
|
|
|
+
|
|
|
+Each successive processor depends on the output of the previous processor, so the order of processors is important.
|
|
|
+The modified documents are indexed into {es} after all processors are applied.
|
|
|
+
|
|
|
+{es} includes over 40 configurable processors.
|
|
|
+The subpages in this section contain reference documentation for each processor.
|
|
|
+To get a list of available
|
|
|
processors, use the <<cluster-nodes-info,nodes info>> API.
|
|
|
|
|
|
[source,console]
|
|
@@ -12,11 +20,191 @@ processors, use the <<cluster-nodes-info,nodes info>> API.
|
|
|
GET _nodes/ingest?filter_path=nodes.*.ingest.processors
|
|
|
----
|
|
|
|
|
|
-The pages in this section contain reference documentation for each processor.
|
|
|
+[discrete]
|
|
|
+[[ingest-processors-categories]]
|
|
|
+=== Ingest processors by category
|
|
|
+
|
|
|
+We've categorized the available processors on this page and summarized their functions.
|
|
|
+This will help you find the right processor for your use case.
|
|
|
+
|
|
|
+* <<ingest-process-category-data-enrichment>>
|
|
|
+* <<ingest-process-category-data-transformation>>
|
|
|
+* <<ingest-process-category-data-filtering>>
|
|
|
+* <<ingest-process-category-pipeline-handling>>
|
|
|
+* <<ingest-process-category-array-json-handling>>
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[ingest-process-category-data-enrichment]]
|
|
|
+=== Data enrichment processors
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[ingest-process-category-data-enrichment-general]]
|
|
|
+==== General outcomes
|
|
|
+
|
|
|
+<<append-processor, `append` processor>>::
|
|
|
+Appends a value to a field.
|
|
|
+
|
|
|
+<<date-index-name-processor, `date_index_name` processor>>::
|
|
|
+Points documents to the right time-based index based on a date or timestamp field.
|
|
|
+
|
|
|
+<<enrich-processor, `enrich` processor>>::
|
|
|
+Enriches documents with data from another index.
|
|
|
+[TIP]
|
|
|
+====
|
|
|
+Refer to <<ingest-enriching-data, Enrich your data>> for detailed examples of how to use the `enrich` processor to add data from your existing indices to incoming documents during ingest.
|
|
|
+====
|
|
|
+
|
|
|
+<<inference-processor, `inference` processor>>::
|
|
|
+Uses {ml} to classify and tag text fields.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[ingest-process-category-data-enrichment-specific]]
|
|
|
+==== Specific outcomes
|
|
|
+
|
|
|
+<<attachment, `attachment` processor>>::
|
|
|
+Parses and indexes binary data, such as PDFs and Word documents.
|
|
|
+
|
|
|
+<<ingest-circle-processor, `circle` processor>>::
|
|
|
+Converts a location field to a Geo-Point field.
|
|
|
+
|
|
|
+<<community-id-processor, `community_id` processor>>::
|
|
|
+Computes the Community ID for network flow data.
|
|
|
+
|
|
|
+<<fingerprint-processor, `fingerprint` processor>>::
|
|
|
+Computes a hash of the document’s content.
|
|
|
+
|
|
|
+<<ingest-geo-grid-processor, `geo_grid` processor>>::
|
|
|
+Converts geo-grid definitions of grid tiles or cells to regular bounding boxes or polygons which describe their shape.
|
|
|
+
|
|
|
+<<geoip-processor, `geoip` processor>>::
|
|
|
+Adds information about the geographical location of an IPv4 or IPv6 address.
|
|
|
+
|
|
|
+<<network-direction-processor, `network_direction` processor>>::
|
|
|
+Calculates the network direction given a source IP address, destination IP address, and a list of internal networks.
|
|
|
+
|
|
|
+<<registered-domain-processor, `registered_domain` processor>>::
|
|
|
+Extracts the registered domain (also known as the effective top-level domain or eTLD), sub-domain, and top-level domain from a fully qualified domain name (FQDN).
|
|
|
+
|
|
|
+<<ingest-node-set-security-user-processor, `set_security_user` processor>>::
|
|
|
+Sets user-related details (such as `username`, `roles`, `email`, `full_name`,`metadata`, `api_key`, `realm` and `authentication_type`) from the current authenticated user to the current document by pre-processing the ingest.
|
|
|
+
|
|
|
+<<uri-parts-processor, `uri_parts` processor>>::
|
|
|
+Parses a Uniform Resource Identifier (URI) string and extracts its components as an object.
|
|
|
+
|
|
|
+<<urldecode-processor, `urldecode` processor>>::
|
|
|
+URL-decodes a string.
|
|
|
+
|
|
|
+<<user-agent-processor, `user_agent` processor>>::
|
|
|
+Parses user-agent strings to extract information about web clients.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[ingest-process-category-data-transformation]]
|
|
|
+=== Data transformation processors
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[ingest-process-category-data-transformation-general]]
|
|
|
+==== General outcomes
|
|
|
+
|
|
|
+<<convert-processor, `convert` processor>>::
|
|
|
+Converts a field in the currently ingested document to a different type, such as converting a string to an integer.
|
|
|
+
|
|
|
+<<dissect-processor, `dissect` processor>>::
|
|
|
+Extracts structured fields out of a single text field within a document.
|
|
|
+Unlike the <<grok-processor,grok processor>>, dissect does not use regular expressions.
|
|
|
+This makes the dissect's a simpler and often faster alternative.
|
|
|
+
|
|
|
+<<grok-processor, `grok` processor>>::
|
|
|
+Extracts structured fields out of a single text field within a document, using the <<grok, Grok>> regular expression dialect that supports reusable aliased expressions.
|
|
|
+
|
|
|
+<<gsub-processor, `gsub` processor>>::
|
|
|
+Converts a string field by applying a regular expression and a replacement.
|
|
|
+
|
|
|
+<<redact-processor, `redact` processor>>::
|
|
|
+Uses the <<grok, Grok>> rules engine to obscure text in the input document matching the given Grok patterns.
|
|
|
+
|
|
|
+<<rename-processor, `rename` processor>>::
|
|
|
+Renames an existing field.
|
|
|
+
|
|
|
+<<set-processor, `set` processor>>::
|
|
|
+Sets a value on a field.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[ingest-process-category-data-transformation-specific]]
|
|
|
+==== Specific outcomes
|
|
|
+
|
|
|
+<<bytes-processor, `bytes` processor>>::
|
|
|
+Converts a human-readable byte value to its value in bytes (for example `1kb` becomes `1024`).
|
|
|
+
|
|
|
+<<csv-processor, `csv` processor>>::
|
|
|
+Extracts a single line of CSV data from a text field.
|
|
|
+
|
|
|
+<<date-processor, `date` processor>>::
|
|
|
+Extracts and converts date fields.
|
|
|
+
|
|
|
+<<dot-expand-processor, `dot_expand`>> processor::
|
|
|
+Expands a field with dots into an object field.
|
|
|
+
|
|
|
+<<htmlstrip-processor, `html_strip` processor>>::
|
|
|
+Removes HTML tags from a field.
|
|
|
+
|
|
|
+<<join-processor, `join` processor>>::
|
|
|
+Joins each element of an array into a single string using a separator character between each element.
|
|
|
+
|
|
|
+<<kv-processor, `kv` processor>>::
|
|
|
+Parse messages (or specific event fields) containing key-value pairs.
|
|
|
+
|
|
|
+<<lowercase-processor, `lowercase` processor>> and <<uppercase-processor, `uppercase` processor>>::
|
|
|
+Converts a string field to lowercase or uppercase.
|
|
|
+
|
|
|
+<<split-processor, `split` processor>>::
|
|
|
+Splits a field into an array of values.
|
|
|
+
|
|
|
+<<trim-processor, `trim` processor>>::
|
|
|
+Trims whitespace from field.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[ingest-process-category-data-filtering]]
|
|
|
+=== Data filtering processors
|
|
|
+
|
|
|
+<<drop-processor, `drop` processor>>::
|
|
|
+Drops the document without raising any errors.
|
|
|
+
|
|
|
+<<remove-processor, `remove` processor>>::
|
|
|
+Removes fields from documents.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[ingest-process-category-pipeline-handling]]
|
|
|
+=== Pipeline handling processors
|
|
|
+
|
|
|
+<<fail-processor, `fail` processor>>::
|
|
|
+Raises an exception. Useful for when you expect a pipeline to fail and want to relay a specific message to the requester.
|
|
|
+
|
|
|
+<<pipeline-processor, `pipeline` processor>>::
|
|
|
+Executes another pipeline.
|
|
|
+
|
|
|
+<<reroute-processor, `reroute` processor>>::
|
|
|
+Reroutes documents to another target index or data stream.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[ingest-process-category-array-json-handling]]
|
|
|
+=== Array/JSON handling processors
|
|
|
+
|
|
|
+<<foreach-processor, `for_each` processor>>::
|
|
|
+Runs an ingest processor on each element of an array or object.
|
|
|
+
|
|
|
+<<json-processor, `json` processor>>::
|
|
|
+Converts a JSON string into a structured JSON object.
|
|
|
+
|
|
|
+<<script-processor, `script` processor>>::
|
|
|
+Runs an inline or stored <<modules-scripting, script>> on incoming documents.
|
|
|
+The script runs in the {painless}/painless-ingest-processor-context.html[painless `ingest` context].
|
|
|
+
|
|
|
+<<sort-processor, `sort` processor>>::
|
|
|
+Sorts the elements of an array in ascending or descending order.
|
|
|
|
|
|
[discrete]
|
|
|
[[ingest-process-plugins]]
|
|
|
-=== Processor plugins
|
|
|
+=== Add additional processors
|
|
|
|
|
|
You can install additional processors as {plugins}/ingest.html[plugins].
|
|
|
|