[DOCS] Adds section about how to use ingest timestamp to sync a transform (#87650)

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
István Zoltán Szabó 3 years ago
parent
commit
d48e1a2488

+ 1 - 1
docs/reference/transform/apis/put-transform.asciidoc

@@ -271,7 +271,7 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=sync-time-delay]
 include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=sync-time-field]
 +
 --
-TIP: In general, it’s a good idea to use a field that contains the
+TIP: It is strongly recommended to use a field that contains the
 <<access-ingest-metadata,ingest timestamp>>. If you use a different field,
 you might need to set the `delay` such that it accounts for data transmission
 delays.

+ 51 - 2
docs/reference/transform/checkpoints.asciidoc

@@ -40,8 +40,8 @@ The {transform} applies changes related to either new or changed entities or
 time buckets to the destination index. The set of changes can be paginated. The
 {transform} performs a composite aggregation similarly to the batch {transform} 
 operation, however it also injects query filters based on the previous step to 
-reduce the amount of work. After all changes have been applied, the checkpoint is 
-complete.
+reduce the amount of work. After all changes have been applied, the checkpoint 
+is complete.
 --
 
 This checkpoint process involves both search and indexing activity on the
@@ -55,6 +55,55 @@ TIP: If the cluster experiences unsuitable performance degradation due to the
 {transform}, stop the {transform} and refer to <<transform-performance>>.
 
 
+[discrete]
+[[sync-field-ingest-timestamp]]
+== Using the ingest timestamp for syncing the {transform}
+
+
+
+In most cases, it is strongly recommended to use the ingest timestamp of the 
+source indices for syncing the {transform}. This is the optimal way for 
+{transforms} to be able to identify new changes. If your data source follows the 
+{ecs-ref}/ecs-reference.html[ECS standard], you might already have an 
+{ecs-ref}/ecs-event.html#field-event-ingested[`event.ingested`] field. In this 
+case, use `event.ingested` as the `sync`.`time`.`field` property of your 
+{transform}.
+
+If you don't have an `event.ingested` field or it isn't populated, you can set it 
+by using an ingest pipeline. Create an ingest pipeline either using the 
+<<put-pipeline-api, ingest pipeline API>> (like the example below) or via {kib} 
+under **Stack Management > Ingest Pipelines**. Use a 
+<<set-processor,`set` processor>> to set the field and associate it with the 
+value of the ingest timestamp.
+
+[source,console]
+----------------------------------
+PUT _ingest/pipeline/set_ingest_time
+{
+  "description": "Set ingest timestamp.",
+  "processors": [
+    {
+      "set": {
+        "field": "event.ingested",
+        "value": "{{{_ingest.timestamp}}}"
+      }
+    }
+  ]
+}
+----------------------------------
+
+After you create the ingest pipeline, apply it to the source indices of your 
+{transform}. The pipeline adds the field `event.ingested` to every document with 
+the value of the ingest timestamp. Configure the `sync`.`time`.`field` property 
+of your {transform} to use the field by using the 
+<<put-transform,create {transform} API>> for new {transforms} or the 
+<<update-transform, update {transform} API>> for existing {transforms}. The 
+`event.ingested` field is used for syncing the {transform}. 
+
+Refer to <<add-pipeline-to-indexing-request>> and <<ingest>> to learn more about 
+how to use an ingest pipeline.
+
+
 [discrete]
 [[ml-transform-checkpoint-heuristics]]
 == Change detection heuristics
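
Note (not part of the commit): the new section describes applying the pipeline to the source indices and pointing `sync.time.field` at `event.ingested`, but the diff only shows the pipeline definition. A minimal sketch of those two remaining steps might look like the following. The index name `my-source-index`, the {transform} ID `my-transform`, and the `60s` delay are hypothetical placeholders, and setting `index.default_pipeline` is only one of several ways to attach a pipeline to an index.

[source,console]
----------------------------------
# Assumption: my-source-index is a hypothetical source index of the transform.
# Make set_ingest_time the default pipeline for documents indexed into it.
PUT my-source-index/_settings
{
  "index.default_pipeline": "set_ingest_time"
}

# Assumption: my-transform is a hypothetical existing continuous transform.
# Switch its sync field to event.ingested; the delay value is illustrative.
POST _transform/my-transform/_update
{
  "sync": {
    "time": {
      "field": "event.ingested",
      "delay": "60s"
    }
  }
}
----------------------------------

A default pipeline only runs on documents indexed after the setting is applied, so documents already in the index do not gain an `event.ingested` value unless they are reindexed through the pipeline.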