
[DOCS] [8.17] Adds new default inference endpoint information (#117985) (#118240)

* Adds new default inference information

* Update docs/reference/mapping/types/semantic-text.asciidoc



* Update docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc



* Update docs/reference/mapping/types/semantic-text.asciidoc



---------

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
Co-authored-by: David Kyle <david.kyle@elastic.co>
kosabogi · 10 months ago · commit 4f21c62ce4

+ 10 - 9
docs/reference/mapping/types/semantic-text.asciidoc

@@ -12,13 +12,14 @@ Long passages are <<auto-text-chunking, automatically chunked>> to smaller secti
 
 The `semantic_text` field type specifies an inference endpoint identifier that will be used to generate embeddings.
 You can create the inference endpoint by using the <<put-inference-api>>.
-This field type and the <<query-dsl-semantic-query,`semantic` query>> type make it simpler to perform semantic search on your data.
-If you don't specify an inference endpoint, the <<infer-service-elser,ELSER service>> is used by default.
+This field type and the <<query-dsl-semantic-query,`semantic` query>> type make it simpler to perform semantic search on your data.
+
+If you don't specify an inference endpoint, the `inference_id` field defaults to `.elser-2-elasticsearch`, a preconfigured endpoint for the `elasticsearch` service.
 
 Using `semantic_text`, you won't need to specify how to generate embeddings for your data, or how to index it.
 The {infer} endpoint automatically determines the embedding generation, indexing, and query to use.
 
-If you use the ELSER service, you can set up `semantic_text` with the following API request:
+If you use the preconfigured `.elser-2-elasticsearch` endpoint, you can set up `semantic_text` with the following API request:
 
 [source,console]
 ------------------------------------------------------------
@@ -34,7 +35,7 @@ PUT my-index-000001
 }
 ------------------------------------------------------------
 
-If you use a service other than ELSER, you must create an {infer} endpoint using the <<put-inference-api>> and reference it when setting up `semantic_text` as the following example demonstrates:
+To use a custom {infer} endpoint instead of the default `.elser-2-elasticsearch`, use the <<put-inference-api>> to create the endpoint, then specify its `inference_id` when setting up the `semantic_text` field type, as the following example demonstrates:
 
 [source,console]
 ------------------------------------------------------------
@@ -53,8 +54,7 @@ PUT my-index-000002
 // TEST[skip:Requires inference endpoint]
 <1> The `inference_id` of the {infer} endpoint to use to generate embeddings.
 
-
-The recommended way to use semantic_text is by having dedicated {infer} endpoints for ingestion and search.
+The recommended way to use `semantic_text` is by having dedicated {infer} endpoints for ingestion and search.
 This ensures that search speed remains unaffected by ingestion workloads, and vice versa.
 After creating dedicated {infer} endpoints for both, you can reference them using the `inference_id` and `search_inference_id` parameters when setting up the index mapping for an index that uses the `semantic_text` field.
 
@@ -82,10 +82,11 @@ PUT my-index-000003
 
 `inference_id`::
 (Required, string)
-{infer-cap} endpoint that will be used to generate the embeddings for the field.
+{infer-cap} endpoint that will be used to generate embeddings for the field.
+By default, `.elser-2-elasticsearch` is used.
 This parameter cannot be updated.
 Use the <<put-inference-api>> to create the endpoint.
-If `search_inference_id` is specified, the {infer} endpoint defined by `inference_id` will only be used at index time.
+If `search_inference_id` is specified, the {infer} endpoint defined by `inference_id` will only be used at index time.
 
 `search_inference_id`::
 (Optional, string)
@@ -201,7 +202,7 @@ PUT test-index
         "properties": {
             "infer_field": {
                 "type": "semantic_text",
-                "inference_id": "my-elser-endpoint"
+                "inference_id": ".elser-2-elasticsearch"
             },
             "source_field": {
                 "type": "text",

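The dedicated-endpoint recommendation above pairs `inference_id` with `search_inference_id` in the index mapping. A minimal sketch of that pairing follows; the endpoint names `my-ingest-endpoint` and `my-search-endpoint` are hypothetical placeholders for endpoints created beforehand with the <<put-inference-api>>:

[source,console]
------------------------------------------------------------
PUT my-index-000003
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-ingest-endpoint", <1>
        "search_inference_id": "my-search-endpoint" <2>
      }
    }
  }
}
------------------------------------------------------------
// TEST[skip:Requires inference endpoints]
<1> Hypothetical endpoint used to generate embeddings at index time.
<2> Hypothetical endpoint used to generate embeddings at query time.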
+ 4 - 4
docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc

@@ -14,15 +14,15 @@ You don't need to define model related settings and parameters, or create {infer
 The recommended way to use <<semantic-search,semantic search>> in the {stack} is following the `semantic_text` workflow.
 When you need more control over indexing and query settings, you can still use the complete {infer} workflow (refer to <<semantic-search-inference,this tutorial>> to review the process).
 
-This tutorial uses the <<inference-example-elser,`elser` service>> for demonstration, but you can use any service and their supported models offered by the {infer-cap} API.
+This tutorial uses the <<infer-service-elasticsearch,`elasticsearch` service>> for demonstration, but you can use any service and its supported models offered by the {infer-cap} API.
 
 
 [discrete]
 [[semantic-text-requirements]]
 ==== Requirements
 
-This tutorial uses the <<infer-service-elser,ELSER service>> for demonstration, which is created automatically as needed.
-To use the `semantic_text` field type with an {infer} service other than ELSER, you must create an inference endpoint using the <<put-inference-api>>.
+This tutorial uses the <<infer-service-elasticsearch,`elasticsearch` service>> for demonstration, with a preconfigured endpoint that is created automatically as needed.
+To use the `semantic_text` field type with an {infer} service other than the `elasticsearch` service, you must create an inference endpoint using the <<put-inference-api>>.
 
 
 [discrete]
 [discrete]
@@ -48,7 +48,7 @@ PUT semantic-embeddings
 // TEST[skip:TBD]
 <1> The name of the field to contain the generated embeddings.
 <2> The field to contain the embeddings is a `semantic_text` field.
-Since no `inference_id` is provided, the <<infer-service-elser,ELSER service>> is used by default.
+Since no `inference_id` is provided, the default endpoint `.elser-2-elasticsearch` for the <<infer-service-elasticsearch,`elasticsearch` service>> is used.
 To use a different {infer} service, you must create an {infer} endpoint first using the <<put-inference-api>> and then specify it in the `semantic_text` field mapping using the `inference_id` parameter.
 
 

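Once documents are indexed, the tutorial's default-endpoint setup is searched with the <<query-dsl-semantic-query,`semantic` query>>. A minimal sketch follows; the field name `content` and the query text are illustrative placeholders for the names defined in your own mapping:

[source,console]
------------------------------------------------------------
GET semantic-embeddings/_search
{
  "query": {
    "semantic": {
      "field": "content", <1>
      "query": "How to avoid muscle soreness after running?"
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The `semantic_text` field to search; a placeholder for the field defined in the mapping.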
+ 7 - 44
docs/reference/search/search-your-data/semantic-text-hybrid-search

@@ -8,47 +8,12 @@ This tutorial demonstrates how to perform hybrid search, combining semantic sear
 
 In hybrid search, semantic search retrieves results based on the meaning of the text, while full-text search focuses on exact word matches. By combining both methods, hybrid search delivers more relevant results, particularly in cases where relying on a single approach may not be sufficient.
 
-The recommended way to use hybrid search in the {stack} is following the `semantic_text` workflow. This tutorial uses the <<inference-example-elser,`elser` service>> for demonstration, but you can use any service and its supported models offered by the {infer-cap} API.
-
-[discrete]
-[[semantic-text-hybrid-infer-endpoint]]
-==== Create the {infer} endpoint
-
-Create an inference endpoint by using the <<put-inference-api>>:
-
-[source,console]
-------------------------------------------------------------
-PUT _inference/sparse_embedding/my-elser-endpoint <1>
-{
-  "service": "elser", <2>
-  "service_settings": {
-    "adaptive_allocations": { <3>
-      "enabled": true,
-      "min_number_of_allocations": 3,
-      "max_number_of_allocations": 10
-    },
-    "num_threads": 1
-  }
-}
-------------------------------------------------------------
-// TEST[skip:TBD]
-<1> The task type is `sparse_embedding` in the path as the `elser` service will
-be used and ELSER creates sparse vectors. The `inference_id` is
-`my-elser-endpoint`.
-<2> The `elser` service is used in this example.
-<3> This setting enables and configures adaptive allocations.
-Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.
-
-[NOTE]
-====
-You might see a 502 bad gateway error in the response when using the {kib} Console.
-This error usually just reflects a timeout, while the model downloads in the background.
-You can check the download progress in the {ml-app} UI.
-====
+The recommended way to use hybrid search in the {stack} is following the `semantic_text` workflow.
+This tutorial uses the <<infer-service-elasticsearch,`elasticsearch` service>> for demonstration, but you can use any service and its supported models offered by the {infer-cap} API.
 
 [discrete]
 [[hybrid-search-create-index-mapping]]
-==== Create an index mapping for hybrid search
+==== Create an index mapping
 
 The destination index will contain both the embeddings for semantic search and the original text field for full-text search. This structure enables the combination of semantic search and full-text search.
 
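Later in the tutorial, the mapping below is exercised by a hybrid query that combines a `match` query on the `content` field with a `semantic` query on the `semantic_text` field, typically fused with reciprocal rank fusion. A minimal sketch, with illustrative query text:

[source,console]
------------------------------------------------------------
GET semantic-embeddings/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": { <1>
            "query": {
              "match": {
                "content": "How to avoid muscle soreness after running?"
              }
            }
          }
        },
        {
          "standard": { <2>
            "query": {
              "semantic": {
                "field": "semantic_text",
                "query": "How to avoid muscle soreness after running?"
              }
            }
          }
        }
      ]
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> Lexical (full-text) retrieval on the `content` field.
<2> Semantic retrieval on the `semantic_text` field.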
@@ -60,11 +25,10 @@ PUT semantic-embeddings
     "properties": {
       "semantic_text": { <1>
-        "type": "semantic_text", 
-        "inference_id": "my-elser-endpoint" <2>
+        "type": "semantic_text"
       },
-      "content": { <3>
+      "content": { <2>
         "type": "text",
-        "copy_to": "semantic_text" <4>
+        "copy_to": "semantic_text" <3>
       }
     }
   }
@@ -72,9 +36,8 @@ PUT semantic-embeddings
 ------------------------------------------------------------
 // TEST[skip:TBD]
 <1> The name of the field to contain the generated embeddings for semantic search.
-<2> The identifier of the inference endpoint that generates the embeddings based on the input text.
-<3> The name of the field to contain the original text for lexical search.
-<4> The textual data stored in the `content` field will be copied to `semantic_text` and processed by the {infer} endpoint.
+<2> The name of the field to contain the original text for lexical search.
+<3> The textual data stored in the `content` field will be copied to `semantic_text` and processed by the {infer} endpoint.
 
 [NOTE]
 ====