[role="xpack"]
[[infer-trained-model-deployment]]
= Infer trained model deployment API
[subs="attributes"]
++++
<titleabbrev>Infer trained model deployment</titleabbrev>
++++

Evaluates a trained model.

[[infer-trained-model-deployment-request]]
== {api-request-title}

`POST _ml/trained_models/<model_id>/deployment/_infer`

////
[[infer-trained-model-deployment-prereq]]
== {api-prereq-title}
////

////
[[infer-trained-model-deployment-desc]]
== {api-description-title}
////

[[infer-trained-model-deployment-path-params]]
== {api-path-parms-title}

`<model_id>`::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

[[infer-trained-model-deployment-query-params]]
== {api-query-parms-title}

`timeout`::
(Optional, time)
Controls the amount of time to wait for {infer} results. Defaults to 10 seconds.
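
For example, the following request allows up to 30 seconds for the {infer}
results to be returned. The `model2` model ID is illustrative and matches the
examples below:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer?timeout=30s
{
  "docs": [{"text_field": "The movie was awesome!!"}]
}
--------------------------------------------------
// TEST[skip:TBD]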

[[infer-trained-model-request-body]]
== {api-request-body-title}

`docs`::
(Required, array)
An array of objects to pass to the model for inference. The objects should
contain a field matching your configured trained model input. Typically, the
field name is `text_field`. Currently, only a single value is allowed.
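
The field name must match the input field that the trained model was
configured with; `text_field` is only the typical name. As a sketch, if a
model were configured with a hypothetical input field called `my_text`, the
document would use that key instead (the `model2` model ID is again
illustrative):

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [{"my_text": "The movie was awesome!!"}]
}
--------------------------------------------------
// TEST[skip:TBD]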

////
[[infer-trained-model-deployment-results]]
== {api-response-body-title}
////

////
[[infer-trained-model-deployment-response-codes]]
== {api-response-codes-title}
////

[[infer-trained-model-deployment-example]]
== {api-examples-title}

The response depends on the task the model is trained for. For a text
classification task, the response contains the predicted class and its
probability score. For example:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [{"text_field": "The movie was awesome!!"}]
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the predicted label and the confidence.

[source,console-result]
----
{
  "predicted_value" : "POSITIVE",
  "prediction_probability" : 0.9998667964092964
}
----
// NOTCONSOLE

For named entity recognition (NER) tasks, the response contains the annotated
text output and the recognized entities.

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [{"text_field": "Hi my name is Josh and I live in Berlin"}]
}
--------------------------------------------------
// TEST[skip:TBD]

In this case, the API returns:

[source,console-result]
----
{
  "predicted_value" : "Hi my name is [Josh](PER&Josh) and I live in [Berlin](LOC&Berlin)",
  "entities" : [
    {
      "entity" : "Josh",
      "class_name" : "PER",
      "class_probability" : 0.9977303419824,
      "start_pos" : 14,
      "end_pos" : 18
    },
    {
      "entity" : "Berlin",
      "class_name" : "LOC",
      "class_probability" : 0.9992474323902818,
      "start_pos" : 33,
      "end_pos" : 39
    }
  ]
}
----
// NOTCONSOLE

Zero-shot classification tasks require extra configuration defining the class
labels. These labels are passed in the zero-shot inference config.

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [
    {
      "text_field": "This is a very happy person"
    }
  ],
  "inference_config": {
    "zero_shot_classification": {
      "labels": [
        "glad",
        "sad",
        "bad",
        "rad"
      ],
      "multi_label": false
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the predicted label and the confidence, as well as the top
classes:

[source,console-result]
----
{
  "predicted_value" : "glad",
  "top_classes" : [
    {
      "class_name" : "glad",
      "class_probability" : 0.8061155063386439,
      "class_score" : 0.8061155063386439
    },
    {
      "class_name" : "rad",
      "class_probability" : 0.18218006158387956,
      "class_score" : 0.18218006158387956
    },
    {
      "class_name" : "bad",
      "class_probability" : 0.006325615787634201,
      "class_score" : 0.006325615787634201
    },
    {
      "class_name" : "sad",
      "class_probability" : 0.0053788162898424545,
      "class_score" : 0.0053788162898424545
    }
  ],
  "prediction_probability" : 0.8061155063386439
}
----
// NOTCONSOLE

The tokenization truncate option can be overridden when calling the API:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/deployment/_infer
{
  "docs": [{"text_field": "The Amazon rainforest covers most of the Amazon basin in South America"}],
  "inference_config": {
    "ner": {
      "tokenization": {
        "bert": {
          "truncate": "first"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

When the input has been truncated because of the limit imposed by the model's
`max_sequence_length`, the `is_truncated` field appears in the response.

[source,console-result]
----
{
  "predicted_value" : "The [Amazon](LOC&Amazon) rainforest covers most of the [Amazon](LOC&Amazon) basin in [South America](LOC&South+America)",
  "entities" : [
    {
      "entity" : "Amazon",
      "class_name" : "LOC",
      "class_probability" : 0.9505460915724254,
      "start_pos" : 4,
      "end_pos" : 10
    },
    {
      "entity" : "Amazon",
      "class_name" : "LOC",
      "class_probability" : 0.9969992804311777,
      "start_pos" : 41,
      "end_pos" : 47
    }
  ],
  "is_truncated" : true
}
----
// NOTCONSOLE
|