[role="xpack"]
[[infer-trained-model-deployment]]
= Infer trained model deployment API
[subs="attributes"]
++++
<titleabbrev>Infer trained model deployment</titleabbrev>
++++

Evaluates a trained model.

[[infer-trained-model-deployment-request]]
== {api-request-title}

`POST _ml/trained_models/<model_id>/deployment/_infer`

////
[[infer-trained-model-deployment-prereq]]
== {api-prereq-title}

////
////
[[infer-trained-model-deployment-desc]]
== {api-description-title}

////

[[infer-trained-model-deployment-path-params]]
== {api-path-parms-title}

`<model_id>`::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

[[infer-trained-model-deployment-query-params]]
== {api-query-parms-title}

`timeout`::
(Optional, time)
Controls the amount of time to wait for {infer} results. Defaults to 10 seconds.

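
As a rough illustration of how the path and query parameters combine, the following Python sketch builds the request URL. The helper name `infer_url` and the base address `http://localhost:9200` are assumptions made for this example; they are not part of the API.

```python
from urllib.parse import quote, urlencode


def infer_url(base, model_id, timeout=None):
    """Build the _infer URL for a deployed model (hypothetical helper)."""
    # Percent-encode the model ID so it is safe to use as a path segment.
    path = f"/_ml/trained_models/{quote(model_id, safe='')}/deployment/_infer"
    # Add the optional `timeout` query parameter only when it is given.
    query = f"?{urlencode({'timeout': timeout})}" if timeout else ""
    return base.rstrip("/") + path + query


print(infer_url("http://localhost:9200", "model2", timeout="30s"))
```

When `timeout` is omitted, the URL carries no query string and the documented default of 10 seconds applies.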
[[infer-trained-model-request-body]]
== {api-request-body-title}

`docs`::
(Required, array)
An array of objects to pass to the model for inference. Each object should
contain a field that matches your configured trained model input. Typically, the
field name is `text_field`. Currently, only a single value is allowed in the array.

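
The request body can be assembled programmatically. The sketch below is one possible way to do it in Python; the helper `build_infer_body` is hypothetical, and it reads "only a single value" as "exactly one object in `docs`", which is an assumption.

```python
import json


def build_infer_body(text, field="text_field"):
    """Wrap one input string in the `docs` array (hypothetical helper)."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    # Assumption: the API accepts exactly one object in `docs`.
    return {"docs": [{field: text}]}


body = build_infer_body("The movie was awesome!!")
print(json.dumps(body))
```

If your model was configured with a different input field, pass its name as `field`.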
////
[[infer-trained-model-deployment-results]]
== {api-response-body-title}

////
////
[[ml-get-trained-models-response-codes]]
== {api-response-codes-title}

////

[[infer-trained-model-deployment-example]]
== {api-examples-title}

The response depends on the task that the model is trained for. For a
text classification task, the response contains the predicted class and its
score. For example:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [{"text_field": "The movie was awesome!!"}]
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the predicted label and the confidence.

[source,console-result]
----
{
  "predicted_value" : "POSITIVE",
  "prediction_probability" : 0.9998667964092964
}
----
// NOTCONSOLE

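
A client might act on the label only when the confidence is high enough. The following Python sketch shows one way to do that with the response above; the function `confident_label` and the `0.9` threshold are illustrative assumptions, not recommended values.

```python
import json

# The text classification response shown above.
response = json.loads(
    '{"predicted_value": "POSITIVE", "prediction_probability": 0.9998667964092964}'
)


def confident_label(result, threshold=0.9):
    """Return the predicted label, or None when the confidence is too low."""
    if result["prediction_probability"] >= threshold:
        return result["predicted_value"]
    return None


label = confident_label(response)
print(label)
```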
For named entity recognition (NER) tasks, the response contains the annotated
text output and the recognized entities.

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [{"text_field": "Hi my name is Josh and I live in Berlin"}]
}
--------------------------------------------------
// TEST[skip:TBD]

In this case, the API returns:

[source,console-result]
----
{
  "predicted_value" : "Hi my name is [Josh](PER&Josh) and I live in [Berlin](LOC&Berlin)",
  "entities" : [
    {
      "entity" : "Josh",
      "class_name" : "PER",
      "class_probability" : 0.9977303419824,
      "start_pos" : 14,
      "end_pos" : 18
    },
    {
      "entity" : "Berlin",
      "class_name" : "LOC",
      "class_probability" : 0.9992474323902818,
      "start_pos" : 33,
      "end_pos" : 39
    }
  ]
}
----
// NOTCONSOLE

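
In this example response, `start_pos` and `end_pos` behave like zero-based character offsets into the input text, with `end_pos` exclusive. The short Python sketch below checks that reading against the values shown above; it illustrates this one response and is not a statement of the general contract.

```python
text = "Hi my name is Josh and I live in Berlin"

# Entities copied from the NER response above (probabilities omitted).
entities = [
    {"entity": "Josh", "class_name": "PER", "start_pos": 14, "end_pos": 18},
    {"entity": "Berlin", "class_name": "LOC", "start_pos": 33, "end_pos": 39},
]

# Slice the input with each entity's offsets and compare with `entity`.
spans = [text[e["start_pos"]:e["end_pos"]] for e in entities]
for entity, span in zip(entities, spans):
    print(entity["class_name"], span, span == entity["entity"])
```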
Zero-shot classification tasks require extra configuration that defines the
class labels. Pass these labels in the `zero_shot_classification` inference
configuration:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [
    {
      "text_field": "This is a very happy person"
    }
  ],
  "inference_config": {
    "zero_shot_classification": {
      "labels": [
        "glad",
        "sad",
        "bad",
        "rad"
      ],
      "multi_label": false
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the predicted label and the confidence, as well as the top classes:

[source,console-result]
----
{
  "predicted_value" : "glad",
  "top_classes" : [
    {
      "class_name" : "glad",
      "class_probability" : 0.8061155063386439,
      "class_score" : 0.8061155063386439
    },
    {
      "class_name" : "rad",
      "class_probability" : 0.18218006158387956,
      "class_score" : 0.18218006158387956
    },
    {
      "class_name" : "bad",
      "class_probability" : 0.006325615787634201,
      "class_score" : 0.006325615787634201
    },
    {
      "class_name" : "sad",
      "class_probability" : 0.0053788162898424545,
      "class_score" : 0.0053788162898424545
    }
  ],
  "prediction_probability" : 0.8061155063386439
}
----
// NOTCONSOLE

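
To make the relationship between `predicted_value` and `top_classes` concrete, the following Python sketch picks the highest-probability class from the example response and sums the probabilities. It only inspects the sample values above; that the probabilities sum to roughly 1 when `multi_label` is `false` is an observation about this example, not a documented guarantee.

```python
# Class probabilities copied from the zero-shot response above.
top_classes = [
    {"class_name": "glad", "class_probability": 0.8061155063386439},
    {"class_name": "rad", "class_probability": 0.18218006158387956},
    {"class_name": "bad", "class_probability": 0.006325615787634201},
    {"class_name": "sad", "class_probability": 0.0053788162898424545},
]

# The class with the highest probability matches `predicted_value`.
best = max(top_classes, key=lambda c: c["class_probability"])
total = sum(c["class_probability"] for c in top_classes)

print(best["class_name"], round(total, 6))
```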
You can override the tokenization `truncate` option when calling the API:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [{"text_field": "The Amazon rainforest covers most of the Amazon basin in South America"}],
  "inference_config": {
    "ner": {
      "tokenization": {
        "bert": {
          "truncate": "first"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

When the input has been truncated due to the limit imposed by the model's
`max_sequence_length`, the `is_truncated` field appears in the response.

[source,console-result]
----
{
  "predicted_value" : "The [Amazon](LOC&Amazon) rainforest covers most of the [Amazon](LOC&Amazon) basin in [South America](LOC&South+America)",
  "entities" : [
    {
      "entity" : "Amazon",
      "class_name" : "LOC",
      "class_probability" : 0.9505460915724254,
      "start_pos" : 4,
      "end_pos" : 10
    },
    {
      "entity" : "Amazon",
      "class_name" : "LOC",
      "class_probability" : 0.9969992804311777,
      "start_pos" : 41,
      "end_pos" : 47
    }
  ],
  "is_truncated" : true
}
----
// NOTCONSOLE
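
Because `is_truncated` appears only when truncation happened, client code probably should not assume the key is always present. A minimal Python sketch, assuming that an absent field means the input was not truncated (the helper `was_truncated` is hypothetical):

```python
def was_truncated(result):
    """Report whether the input was truncated (hypothetical helper)."""
    # Assumption: a missing `is_truncated` key means no truncation occurred.
    return bool(result.get("is_truncated", False))


print(was_truncated({"predicted_value": "...", "is_truncated": True}))
print(was_truncated({"predicted_value": "..."}))
```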