
[[infer-service-openai]]
=== OpenAI {infer} integration

.New API reference
[sidebar]
--
For the most up-to-date API details, refer to {api-es}/group/endpoint-inference[{infer-cap} APIs].
--

Creates an {infer} endpoint to perform an {infer} task with the `openai` service or `openai`-compatible APIs.

[discrete]
[[infer-service-openai-api-request]]
==== {api-request-title}

`PUT /_inference/<task_type>/<inference_id>`

[discrete]
[[infer-service-openai-api-path-params]]
==== {api-path-parms-title}

`<inference_id>`::
(Required, string)
include::inference-shared.asciidoc[tag=inference-id]

`<task_type>`::
(Required, string)
include::inference-shared.asciidoc[tag=task-type]
+
--
Available task types:

* `chat_completion`
* `completion`
* `text_embedding`
--

[NOTE]
====
The `chat_completion` task type only supports streaming and only through the `_stream` API.

include::inference-shared.asciidoc[tag=chat-completion-docs]
====
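For illustration, a streaming `chat_completion` request could look like the following. The endpoint name `openai-chat-completion` and the message content are assumptions for this sketch, not endpoints created elsewhere on this page:

[source,console]
------------------------------------------------------------
POST _inference/chat_completion/openai-chat-completion/_stream
{
    "messages": [
        {
            "role": "user",
            "content": "What is Elastic?"
        }
    ]
}
------------------------------------------------------------
// TEST[skip:TBD]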
[discrete]
[[infer-service-openai-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]
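As a sketch of how these settings fit into a request, the following creates a `text_embedding` endpoint that overrides the default chunking behavior. The endpoint name `openai-embeddings-chunked` and the chosen setting values are illustrative assumptions:

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/openai-embeddings-chunked
{
    "service": "openai",
    "service_settings": {
        "api_key": "<api_key>",
        "model_id": "text-embedding-3-small"
    },
    "chunking_settings": {
        "strategy": "sentence",
        "max_chunk_size": 250,
        "sentence_overlap": 1
    }
}
------------------------------------------------------------
// TEST[skip:TBD]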
`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
`openai`.

`service_settings`::
(Required, object)
include::inference-shared.asciidoc[tag=service-settings]
+
--
These settings are specific to the `openai` service.
--

`api_key`:::
(Required, string)
A valid API key for your OpenAI account.
You can find your OpenAI API keys in your OpenAI account under the
https://platform.openai.com/api-keys[API keys section].
+
--
include::inference-shared.asciidoc[tag=api-key-admonition]
--
`dimensions`:::
(Optional, integer)
The number of dimensions the resulting output embeddings should have.
Only supported in `text-embedding-3` and later models.
If not set, the OpenAI-defined default for the model is used.

`model_id`:::
(Required, string)
The name of the model to use for the {infer} task.
Refer to the
https://platform.openai.com/docs/guides/embeddings/what-are-embeddings[OpenAI documentation]
for the list of available text embedding models.

`organization_id`:::
(Optional, string)
The unique identifier of your organization.
You can find the Organization ID in your OpenAI account under
https://platform.openai.com/account/organization[**Settings** > **Organizations**].

`url`:::
(Optional, string)
The URL endpoint to use for the requests.
Can be changed for testing purposes.
Defaults to `https://api.openai.com/v1/embeddings`.

`rate_limit`:::
(Optional, object)
The `openai` service sets a default number of requests allowed per minute depending on the task type.
For `text_embedding`, it is set to `3000`.
For `completion`, it is set to `500`.
This helps minimize the number of rate limit errors returned from OpenAI.
To modify this, set the `requests_per_minute` setting of this object in your service settings:
+
--
include::inference-shared.asciidoc[tag=request-per-minute-example]

More information about the rate limits for OpenAI can be found in your https://platform.openai.com/account/limits[Account limits].
--
`task_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=task-settings]
+
.`task_settings` for the `completion` task type
[%collapsible%closed]
=====
`user`:::
(Optional, string)
Specifies the user issuing the request, which can be used for abuse detection.
=====
+
.`task_settings` for the `text_embedding` task type
[%collapsible%closed]
=====
`user`:::
(Optional, string)
Specifies the user issuing the request, which can be used for abuse detection.
=====

[discrete]
[[inference-example-openai]]
==== OpenAI service example

The following example shows how to create an {infer} endpoint called `openai-embeddings` to perform a `text_embedding` task type.
The embeddings created by requests to this endpoint will have 128 dimensions.
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/openai-embeddings
{
    "service": "openai",
    "service_settings": {
        "api_key": "<api_key>",
        "model_id": "text-embedding-3-small",
        "dimensions": 128
    }
}
------------------------------------------------------------
// TEST[skip:TBD]
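Once created, the `openai-embeddings` endpoint can be called directly with the {infer} API to verify it returns embeddings. The input text below is only an illustration:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/openai-embeddings
{
    "input": "The quick brown fox jumps over the lazy dog."
}
------------------------------------------------------------
// TEST[skip:TBD]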
The next example shows how to create an {infer} endpoint called `openai-completion` to perform a `completion` task type.

[source,console]
------------------------------------------------------------
PUT _inference/completion/openai-completion
{
    "service": "openai",
    "service_settings": {
        "api_key": "<api_key>",
        "model_id": "gpt-3.5-turbo"
    }
}
------------------------------------------------------------
// TEST[skip:TBD]
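The resulting `openai-completion` endpoint can then be used for completion requests with the {infer} API. The prompt below is only an illustration:

[source,console]
------------------------------------------------------------
POST _inference/completion/openai-completion
{
    "input": "What is Elastic?"
}
------------------------------------------------------------
// TEST[skip:TBD]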