semantic_text: Updated copy-to docs (#110350)

Carlos Delgado 1 year ago
parent
commit
30b32b6a46
1 changed file with 39 additions and 5 deletions

+ 39 - 5
docs/reference/mapping/types/semantic-text.asciidoc

@@ -8,7 +8,7 @@
 beta[]
 
 The `semantic_text` field type automatically generates embeddings for text
-content using an inference endpoint. 
+content using an inference endpoint.
 
 The `semantic_text` field type specifies an inference endpoint identifier that will be used to generate embeddings.
 You can create the inference endpoint by using the <<put-inference-api>>.
@@ -24,7 +24,7 @@ PUT my-index-000001
 {
   "mappings": {
     "properties": {
-      "inference_field": { 
+      "inference_field": {
         "type": "semantic_text",
         "inference_id": "my-elser-endpoint"
       }
@@ -40,7 +40,7 @@ PUT my-index-000001
 ==== Parameters for `semantic_text` fields
 
 `inference_id`::
-(Required, string)  
+(Required, string)
 Inference endpoint that will be used to generate the embeddings for the field.
 Use the <<put-inference-api>> to create the endpoint.
 
@@ -137,8 +137,42 @@ field to collect the values of other fields for semantic search. Each value has
 its embeddings calculated separately; each field value is a separate set of chunk(s) in
 the resulting embeddings.
 
-This imposes a restriction on bulk updates to documents with `semantic_text`. 
-In bulk requests, all fields that are copied to a `semantic_text` field must have a value to ensure every embedding is calculated correctly.
+This imposes a restriction on bulk requests and ingestion pipelines that update documents with `semantic_text` fields.
+In these cases, every field that is copied to a `semantic_text` field, as well as the `semantic_text` field itself, must have a value so that every embedding is calculated correctly.
+
+For example, the following mapping:
+
+[source,console]
+------------------------------------------------------------
+PUT test-index
+{
+    "mappings": {
+        "properties": {
+            "infer_field": {
+                "type": "semantic_text",
+                "inference_id": "my-elser-endpoint"
+            },
+            "source_field": {
+                "type": "text",
+                "copy_to": "infer_field"
+            }
+        }
+    }
+}
+------------------------------------------------------------
+// TEST[skip:TBD]
+
+With this mapping, a bulk update request must set both fields to ensure that `infer_field` is updated correctly:
+
+[source,console]
+------------------------------------------------------------
+PUT test-index/_bulk
+{"update": {"_id": "1"}}
+{"doc": {"infer_field": "updated inference field", "source_field": "updated source field"}}
+------------------------------------------------------------
+// TEST[skip:TBD]
+
+Notice that both the `semantic_text` field and the source field are updated in the bulk request.
 
 [discrete]
 [[limitations]]