[role="xpack"]
[[put-inference-api]]
=== Create {infer} API

experimental[]

Creates a model to perform an {infer} task.

IMPORTANT: The {infer} APIs enable you to use certain services, such as ELSER,
OpenAI, or Hugging Face, in your cluster. This is not the same feature that you
can use on an ML node with custom {ml} models. If you want to train and use your
own model, use the <<ml-df-trained-models-apis>>.
[discrete]
[[put-inference-api-request]]
==== {api-request-title}

`PUT /_inference/<task_type>/<model_id>`
[discrete]
[[put-inference-api-prereqs]]
==== {api-prereq-title}

* Requires the `manage` <<privileges-list-cluster,cluster privilege>>.
[discrete]
[[put-inference-api-desc]]
==== {api-description-title}

The create {infer} API enables you to create and configure an {infer} model to
perform a specific {infer} task.

The following services are available through the {infer} API:

* ELSER
* OpenAI
* Hugging Face
[discrete]
[[put-inference-api-path-params]]
==== {api-path-parms-title}

`<model_id>`::
(Required, string)
The unique identifier of the model.

`<task_type>`::
(Required, string)
The type of the {infer} task that the model will perform. Available task types:

* `sparse_embedding`
* `text_embedding`
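
As an illustration of how the two path parameters combine into the request
path, the sketch below (ours, not part of any {es} client) builds the path and
rejects unknown task types:

```python
# Illustrative only: assembles the PUT path from the two path parameters.
# The allowed task types mirror the list above.
ALLOWED_TASK_TYPES = {"sparse_embedding", "text_embedding"}

def inference_path(task_type, model_id):
    """Return the _inference endpoint path for the given task type and model ID."""
    if task_type not in ALLOWED_TASK_TYPES:
        raise ValueError("unsupported task type: %s" % task_type)
    return "/_inference/%s/%s" % (task_type, model_id)

print(inference_path("sparse_embedding", "my-elser-model"))
# -> /_inference/sparse_embedding/my-elser-model
```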
[discrete]
[[put-inference-api-request-body]]
==== {api-request-body-title}

`service`::
(Required, string)
The type of service supported for the specified task type.
Available services:

* `elser`: specify the `sparse_embedding` task type to use the ELSER service.
* `openai`: specify the `text_embedding` task type to use the OpenAI service.
* `hugging_face`: specify the `text_embedding` task type to use the Hugging Face
service.

`service_settings`::
(Required, object)
Settings used to install the {infer} model. These settings are specific to the
`service` you specified.
+
.`service_settings` for `elser`
[%collapsible%closed]
=====
`num_allocations`:::
(Required, integer)
The number of model allocations to create.

`num_threads`:::
(Required, integer)
The number of threads used by each model allocation.
=====
+
.`service_settings` for `openai`
[%collapsible%closed]
=====
`api_key`:::
(Required, string)
A valid API key for your OpenAI account. You can find your OpenAI API keys in
your OpenAI account under the
https://platform.openai.com/api-keys[API keys section].
+
IMPORTANT: You need to provide the API key only once, during the {infer} model
creation. The <<get-inference-api>> does not retrieve your API key. After
creating the {infer} model, you cannot change the associated API key. If you
want to use a different API key, delete the {infer} model and recreate it with
the same name and the updated API key.

`organization_id`:::
(Optional, string)
The unique identifier of your organization. You can find the Organization ID in
your OpenAI account under
https://platform.openai.com/account/organization[**Settings** > **Organizations**].

`url`:::
(Optional, string)
The URL endpoint to use for the requests. Can be changed for testing purposes.
Defaults to `https://api.openai.com/v1/embeddings`.
=====
+
.`service_settings` for `hugging_face`
[%collapsible%closed]
=====
`api_key`:::
(Required, string)
A valid access token for your Hugging Face account. You can find your existing
access tokens or create a new one on the
https://huggingface.co/settings/tokens[settings page].
+
IMPORTANT: You need to provide the API key only once, during the {infer} model
creation. The <<get-inference-api>> does not retrieve your API key. After
creating the {infer} model, you cannot change the associated API key. If you
want to use a different API key, delete the {infer} model and recreate it with
the same name and the updated API key.

`url`:::
(Required, string)
The URL endpoint to use for the requests.
=====

`task_settings`::
(Optional, object)
Settings to configure the {infer} task. These settings are specific to the
`<task_type>` you specified.
+
.`task_settings` for `text_embedding`
[%collapsible%closed]
=====
`model`:::
(Optional, string)
The name of the model to use for the {infer} task. Refer to the
https://platform.openai.com/docs/guides/embeddings/what-are-embeddings[OpenAI documentation]
for the list of available text embedding models.
=====
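
To make the per-service requirements above concrete, here is a small sketch (a
hypothetical helper, not part of any {es} client) that assembles a request body
and checks the required `service_settings` keys for each service:

```python
# Hypothetical helper: validates the required service_settings keys
# described above before building a request body.
REQUIRED_SERVICE_SETTINGS = {
    "elser": {"num_allocations", "num_threads"},
    "openai": {"api_key"},
    "hugging_face": {"api_key", "url"},
}

def build_body(service, service_settings, task_settings=None):
    """Return a request body dict, or raise if required settings are missing."""
    missing = REQUIRED_SERVICE_SETTINGS[service] - set(service_settings)
    if missing:
        raise ValueError(
            "missing required settings for %s: %s" % (service, sorted(missing))
        )
    return {
        "service": service,
        "service_settings": service_settings,
        "task_settings": task_settings or {},
    }

body = build_body("elser", {"num_allocations": 1, "num_threads": 1})
```

For example, `build_body("hugging_face", {"api_key": "..."})` would raise,
because the `url` setting is also required for that service.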
[discrete]
[[put-inference-api-example]]
==== {api-examples-title}

This section contains example API calls for every service type.

[discrete]
[[inference-example-elser]]
===== ELSER service

The following example shows how to create an {infer} model called
`my-elser-model` to perform a `sparse_embedding` task type.
[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}
------------------------------------------------------------
// TEST[skip:TBD]
Example response:

[source,console-result]
------------------------------------------------------------
{
  "model_id": "my-elser-model",
  "task_type": "sparse_embedding",
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}
------------------------------------------------------------
// NOTCONSOLE
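
The response echoes the stored configuration. The snippet below (illustrative
only, using the example response shown above) sketches how a client can read
the echoed fields with any JSON library:

```python
import json

# The example response shown above, as received by a client.
raw = '''{
  "model_id": "my-elser-model",
  "task_type": "sparse_embedding",
  "service": "elser",
  "service_settings": { "num_allocations": 1, "num_threads": 1 },
  "task_settings": {}
}'''

resp = json.loads(raw)
print(resp["model_id"], resp["service_settings"]["num_allocations"])
# -> my-elser-model 1
```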
[discrete]
[[inference-example-openai]]
===== OpenAI service

The following example shows how to create an {infer} model called
`openai_embeddings` to perform a `text_embedding` task type.

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/openai_embeddings
{
  "service": "openai",
  "service_settings": {
    "api_key": "<api_key>"
  },
  "task_settings": {
    "model": "text-embedding-ada-002"
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
[discrete]
[[inference-example-hugging-face]]
===== Hugging Face service

The following example shows how to create an {infer} model called
`hugging-face-embeddings` to perform a `text_embedding` task type.

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/hugging-face-embeddings
{
  "service": "hugging_face",
  "service_settings": {
    "api_key": "<access_token>", <1>
    "url": "<url_endpoint>" <2>
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> A valid Hugging Face access token. You can find it on the
https://huggingface.co/settings/tokens[settings page of your account].
<2> The {infer} endpoint URL you created on Hugging Face.

Create a new {infer} endpoint on
https://ui.endpoints.huggingface.co/[the Hugging Face endpoint page] to get an
endpoint URL. Select the model you want to use on the new endpoint creation
page (for example, `intfloat/e5-small-v2`), then select the `Sentence
Embeddings` task under the Advanced configuration section. Create the endpoint
and copy the URL after the endpoint initialization has finished.