
[role="xpack"]
[[put-inference-api]]
=== Create {infer} API

Creates an {infer} endpoint to perform an {infer} task.

[IMPORTANT]
====
* The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
* For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the {infer} APIs with these models, or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
====

[discrete]
[[put-inference-api-request]]
==== {api-request-title}

`PUT /_inference/<task_type>/<inference_id>`

[discrete]
[[put-inference-api-prereqs]]
==== {api-prereq-title}

* Requires the `manage_inference` <<privileges-list-cluster,cluster privilege>>
(the built-in `inference_admin` role grants this privilege).

[discrete]
[[put-inference-api-path-params]]
==== {api-path-parms-title}

`<inference_id>`::
(Required, string)
include::inference-shared.asciidoc[tag=inference-id]

`<task_type>`::
(Required, string)
include::inference-shared.asciidoc[tag=task-type]
+
--
Refer to the service list in the <<put-inference-api-desc,API description section>> for the available task types.
--

[discrete]
[[put-inference-api-desc]]
==== {api-description-title}

The create {infer} API enables you to create an {infer} endpoint and configure a {ml} model to perform a specific {infer} task.

The following services are available through the {infer} API.
The available task types are listed next to each service name.
Click a link to review the configuration details of that service:

* <<infer-service-alibabacloud-ai-search,AlibabaCloud AI Search>> (`completion`, `rerank`, `sparse_embedding`, `text_embedding`)
* <<infer-service-amazon-bedrock,Amazon Bedrock>> (`completion`, `text_embedding`)
* <<infer-service-anthropic,Anthropic>> (`completion`)
* <<infer-service-azure-ai-studio,Azure AI Studio>> (`completion`, `text_embedding`)
* <<infer-service-azure-openai,Azure OpenAI>> (`completion`, `text_embedding`)
* <<infer-service-cohere,Cohere>> (`completion`, `rerank`, `text_embedding`)
* <<infer-service-elasticsearch,Elasticsearch>> (`rerank`, `sparse_embedding`, `text_embedding` - this service is for built-in models and models uploaded through Eland)
* <<infer-service-elser,ELSER>> (`sparse_embedding`)
* <<infer-service-google-ai-studio,Google AI Studio>> (`completion`, `text_embedding`)
* <<infer-service-google-vertex-ai,Google Vertex AI>> (`rerank`, `text_embedding`)
* <<infer-service-hugging-face,Hugging Face>> (`text_embedding`)
* <<infer-service-mistral,Mistral>> (`text_embedding`)
* <<infer-service-openai,OpenAI>> (`completion`, `text_embedding`)
* <<infer-service-watsonx-ai>> (`text_embedding`)

The {es} and ELSER services run on a {ml} node in your {es} cluster. The rest of
the services connect to external providers.
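
To make the request format concrete, the following is an illustrative example that creates a `sparse_embedding` endpoint backed by the ELSER service. The endpoint name `my-elser-endpoint` and the `service_settings` values are assumptions chosen for the example; refer to the <<infer-service-elser,ELSER>> service page for the settings that service actually accepts.

[source,console]
----
PUT _inference/sparse_embedding/my-elser-endpoint
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}
----

The task type (`sparse_embedding`) must be one of the types listed for the chosen service above, and the `service_settings` object is service-specific.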