
[Inference API] Add Google AI Studio completion docs (#109089)

Tim Grein committed 1 year ago
commit 6d864154ca

+ 1 - 1
docs/reference/inference/delete-inference.asciidoc

@@ -7,7 +7,7 @@ experimental[]
 Deletes an {infer} endpoint.

 IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure or
+{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio or
 Hugging Face. For built-in models and models uploaded though Eland, the {infer}
 APIs offer an alternative way to use and manage trained models. However, if you
 do not plan to use the {infer} APIs to use these models or if you want to use
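A minimal sketch of such a delete call, assuming the `google_ai_studio_completion` endpoint ID used in the put-inference example at the end of this diff:

[source,console]
------------------------------------------------------------
DELETE _inference/completion/google_ai_studio_completion
------------------------------------------------------------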

+ 1 - 1
docs/reference/inference/get-inference.asciidoc

@@ -7,7 +7,7 @@ experimental[]
 Retrieves {infer} endpoint information.

 IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure or
+{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio or
 Hugging Face. For built-in models and models uploaded though Eland, the {infer}
 APIs offer an alternative way to use and manage trained models. However, if you
 do not plan to use the {infer} APIs to use these models or if you want to use
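A minimal sketch of retrieving that endpoint's configuration, under the same assumption about the endpoint ID:

[source,console]
------------------------------------------------------------
GET _inference/completion/google_ai_studio_completion
------------------------------------------------------------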

+ 1 - 1
docs/reference/inference/inference-apis.asciidoc

@@ -5,7 +5,7 @@
 experimental[]

 IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure or
+{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio or
 Hugging Face. For built-in models and models uploaded though Eland, the {infer}
 APIs offer an alternative way to use and manage trained models. However, if you
 do not plan to use the {infer} APIs to use these models or if you want to use

+ 1 - 1
docs/reference/inference/post-inference.asciidoc

@@ -7,7 +7,7 @@ experimental[]
 Performs an inference task on an input text by using an {infer} endpoint.

 IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure or
+{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio or
 Hugging Face. For built-in models and models uploaded though Eland, the {infer}
 APIs offer an alternative way to use and manage trained models. However, if you
 do not plan to use the {infer} APIs to use these models or if you want to use
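A minimal sketch of running a completion through such an endpoint, assuming the same endpoint ID and a placeholder input string:

[source,console]
------------------------------------------------------------
POST _inference/completion/google_ai_studio_completion
{
    "input": "What is Elastic?"
}
------------------------------------------------------------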

+ 50 - 2
docs/reference/inference/put-inference.asciidoc

@@ -8,7 +8,7 @@ Creates an {infer} endpoint to perform an {infer} task.

 IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
 {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure
-OpenAI or Hugging Face. For built-in models and models uploaded though
+OpenAI, Google AI Studio or Hugging Face. For built-in models and models uploaded through
 Eland, the {infer} APIs offer an alternative way to use and manage trained
 models. However, if you do not plan to use the {infer} APIs to use these models
 or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
@@ -45,6 +45,7 @@ The following services are available through the {infer} API:
 * Azure OpenAI
 * Azure AI Studio
 * Elasticsearch (for built-in models and models uploaded through Eland)
+* Google AI Studio


 [discrete]
@@ -84,6 +85,7 @@ OpenAI service.
 * `azureaistudio`: specify the `completion` or `text_embedding` task type to use the Azure AI Studio service.
 * `elasticsearch`: specify the `text_embedding` task type to use the E5
 built-in model or text embedding models uploaded by Eland.
+* `googleaistudio`: specify the `completion` task type to use the Google AI Studio service.

 `service_settings`::
 (Required, object)
@@ -282,6 +284,33 @@ To modify this, set the `requests_per_minute` setting of this object in your ser
 ```
 =====
 +
+.`service_settings` for the `googleaistudio` service
+[%collapsible%closed]
+=====
+`api_key`:::
+(Required, string)
+A valid API key for the Google Gemini API.
+
+`model_id`:::
+(Required, string)
+The name of the model to use for the {infer} task.
+You can find the supported models at https://ai.google.dev/gemini-api/docs/models/gemini[Gemini API models].
+
+`rate_limit`:::
+(Optional, object)
+By default, the `googleaistudio` service sets the number of requests allowed per minute to `360`.
+This helps to minimize the number of rate limit errors returned from Google AI Studio.
+To modify this, set the `requests_per_minute` setting of this object in your service settings:
++
+--
+```
+"rate_limit": {
+    "requests_per_minute": <<number_of_requests>>
+}
+```
+--
+=====
++
 .`service_settings` for the `elasticsearch` service
 [%collapsible%closed]
 =====
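The `rate_limit` object documented in the hunk above is supplied inside `service_settings` when the endpoint is created; a minimal sketch, reusing the endpoint ID and placeholders from the example at the end of this diff (`100` is an arbitrary illustrative value):

[source,console]
------------------------------------------------------------
PUT _inference/completion/google_ai_studio_completion
{
    "service": "googleaistudio",
    "service_settings": {
        "api_key": "<api_key>",
        "model_id": "<model_id>",
        "rate_limit": {
            "requests_per_minute": 100
        }
    }
}
------------------------------------------------------------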
@@ -304,7 +333,6 @@ exceed the number of available processors per node divided by the number of
 allocations. Must be a power of 2. Max allowed value is 32.
 =====

-
 `task_settings`::
 (Optional, object)
 Settings to configure the {infer} task. These settings are specific to the
@@ -701,3 +729,23 @@ PUT _inference/completion/azure_ai_studio_completion
 // TEST[skip:TBD]

 The list of chat completion models that you can choose from in your deployment can be found in the https://ai.azure.com/explore/models?selectedTask=chat-completion[Azure AI Studio model explorer].
+
+[discrete]
+[[inference-example-googleaistudio]]
+===== Google AI Studio service
+
+The following example shows how to create an {infer} endpoint called
+`google_ai_studio_completion` to perform a `completion` task type.
+
+[source,console]
+------------------------------------------------------------
+PUT _inference/completion/google_ai_studio_completion
+{
+    "service": "googleaistudio",
+    "service_settings": {
+        "api_key": "<api_key>",
+        "model_id": "<model_id>"
+    }
+}
+------------------------------------------------------------
+// TEST[skip:TBD]