@@ -7,8 +7,8 @@ experimental[]
Creates an {infer} endpoint to perform an {infer} task.

IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, or
-Hugging Face. For built-in models and models uploaded though
+{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure
+OpenAI, or Hugging Face. For built-in models and models uploaded through
Eland, the {infer} APIs offer an alternative way to use and manage trained
models. However, if you do not plan to use the {infer} APIs to use these models
or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
@@ -42,6 +42,7 @@ The following services are available through the {infer} API:
* ELSER
* Hugging Face
* OpenAI
+* Azure OpenAI
* Elasticsearch (for built-in models and models uploaded through Eland)


@@ -78,6 +79,7 @@ Cohere service.
service.
* `openai`: specify the `completion` or `text_embedding` task type to use the
OpenAI service.
+* `azureopenai`: specify the `text_embedding` task type to use the Azure OpenAI service.
* `elasticsearch`: specify the `text_embedding` task type to use the E5
built-in model or text embedding models uploaded by Eland.

@@ -187,6 +189,41 @@ https://platform.openai.com/account/organization[**Settings** > **Organizations*
(Optional, string)
The URL endpoint to use for the requests. Can be changed for testing purposes.
Defaults to `https://api.openai.com/v1/embeddings`.
+
+=====
++
+.`service_settings` for the `azureopenai` service
+[%collapsible%closed]
+=====
+
+`api_key` or `entra_id`:::
+(Required, string)
+You must provide _either_ an API key or an Entra ID.
+If you provide neither or both, you receive an error when you try to create your model.
+See the https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#authentication[Azure OpenAI Authentication documentation] for more details on these authentication types.
+
+IMPORTANT: You need to provide the API key or Entra ID only once, during the {infer} model creation.
+The <<get-inference-api>> does not retrieve your authentication credentials.
+After creating the {infer} model, you cannot change the associated API key or Entra ID.
+If you want to use a different API key or Entra ID, delete the {infer} model and recreate it with the same name and the updated API key or Entra ID.
+You _must_ have either an `api_key` or an `entra_id` defined.
+If neither is present, an error occurs.
+
+`resource_name`:::
+(Required, string)
+The name of your Azure OpenAI resource.
+You can find it in the https://portal.azure.com/#view/HubsExtension/BrowseAll[list of resources] in the Azure Portal for your subscription.
+
+`deployment_id`:::
+(Required, string)
+The deployment name of your deployed models.
+You can find your Azure OpenAI deployments through the https://oai.azure.com/[Azure OpenAI Studio] portal that is linked to your subscription.
+
+`api_version`:::
+(Required, string)
+The Azure API version ID to use.
+We recommend using the https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#embeddings[latest supported non-preview version].
+
=====
+
.`service_settings` for the `elasticsearch` service
@@ -266,8 +303,17 @@ maximum token length. Defaults to `END`. Valid values are:

`user`:::
(optional, string)
-For `openai` service only. Specifies the user issuing the request, which can be
-used for abuse detection.
+For `openai` and `azureopenai` services only. Specifies the user issuing the
+request, which can be used for abuse detection.
+
+=====
++
+.`task_settings` for the `completion` task type
+[%collapsible%closed]
+=====
+`user`:::
+(optional, string)
+For `openai` service only. Specifies the user issuing the request, which can be used for abuse detection.
=====


@@ -491,3 +537,28 @@ PUT _inference/completion/openai-completion
}
------------------------------------------------------------
// TEST[skip:TBD]
+
+[discrete]
+[[inference-example-azureopenai]]
+===== Azure OpenAI service
+
+The following example shows how to create an {infer} endpoint called
+`azure_openai_embeddings` to perform a `text_embedding` task type.
+Note that the request does not specify a model, because the model is already defined by the Azure OpenAI deployment.
+
+The list of embeddings models that you can choose from in your deployment can be found in the https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#embeddings[Azure models documentation].
+
+[source,console]
+------------------------------------------------------------
+PUT _inference/text_embedding/azure_openai_embeddings
+{
+    "service": "azureopenai",
+    "service_settings": {
+        "api_key": "<api_key>",
+        "resource_name": "<resource_name>",
+        "deployment_id": "<deployment_id>",
+        "api_version": "2024-02-01"
+    }
+}
+------------------------------------------------------------
+// TEST[skip:TBD]
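
The `service_settings` rules added in this change (exactly one of `api_key` or `entra_id`, plus the required `resource_name`, `deployment_id`, and `api_version`) can be sketched as a client-side pre-flight check. This is a minimal sketch and not part of the change itself: `build_azureopenai_request` is a hypothetical helper, the `<entra_id>` placeholder is illustrative, and the check only mirrors the rules as the text states them. The server remains the authority on validation.

```python
import json

# Field names are taken from the documentation in this change.
REQUIRED_FIELDS = ("resource_name", "deployment_id", "api_version")
CREDENTIAL_FIELDS = ("api_key", "entra_id")


def build_azureopenai_request(endpoint_id, service_settings):
    """Return (path, body) for a PUT request that creates the endpoint.

    Hypothetical helper: it applies only the rules stated in the docs text.
    """
    # Every required setting must be present and non-empty.
    missing = [f for f in REQUIRED_FIELDS if not service_settings.get(f)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))

    # Exactly one credential: neither and both are rejected.
    provided = [f for f in CREDENTIAL_FIELDS if service_settings.get(f)]
    if len(provided) != 1:
        raise ValueError("provide exactly one of api_key or entra_id")

    path = "_inference/text_embedding/" + endpoint_id
    body = {"service": "azureopenai", "service_settings": dict(service_settings)}
    return path, body


# Usage: same endpoint as the example above, authenticated with an Entra ID.
path, body = build_azureopenai_request(
    "azure_openai_embeddings",
    {
        "entra_id": "<entra_id>",
        "resource_name": "<resource_name>",
        "deployment_id": "<deployment_id>",
        "api_version": "2024-02-01",
    },
)
print("PUT", path)
print(json.dumps(body, indent=4))
```

Checking before sending avoids a round trip for an obviously invalid request, but it does not replace the error returned by the API when the credentials themselves are rejected.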