[role="xpack"]
[[get-trained-model-deployment-stats]]
= Get trained model deployment statistics API
[subs="attributes"]
++++
<titleabbrev>Get trained model deployment stats</titleabbrev>
++++

Retrieves usage information for trained model deployments.

[[ml-get-trained-model-deployment-stats-request]]
== {api-request-title}

`GET _ml/trained_models/<model_id>/deployment/_stats` +
`GET _ml/trained_models/<model_id>,<model_id_2>/deployment/_stats` +
`GET _ml/trained_models/<model_id_pattern*>,<model_id_2>/deployment/_stats`

[[ml-get-trained-model-deployment-stats-prereq]]
== {api-prereq-title}

Requires the `monitor_ml` cluster privilege. This privilege is included in the
`machine_learning_user` built-in role.

[[ml-get-trained-model-deployment-stats-desc]]
== {api-description-title}

You can get deployment information for multiple trained models in a single API
request by using a comma-separated list of model IDs or a wildcard expression.

[[ml-get-trained-model-deployment-stats-path-params]]
== {api-path-parms-title}

`<model_id>`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

[[ml-get-trained-model-deployment-stats-query-params]]
== {api-query-parms-title}

`allow_no_match`::
(Optional, Boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=allow-no-match-models]

[role="child_attributes"]
[[ml-get-trained-model-deployment-stats-results]]
== {api-response-body-title}

`count`::
(integer)
The total number of deployment statistics that matched the requested ID
patterns.

`deployment_stats`::
(array)
An array of trained model deployment statistics, which are sorted by the
`model_id` value in ascending order.
+
.Properties of trained model deployment stats
[%collapsible%open]
====
`model_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

`model_size`:::
(byte value)
The size of the loaded model in bytes.

`start_time`:::
(long)
The epoch timestamp when the deployment started.

`state`:::
(string)
The overall state of the deployment.
The values may be:
+
--
* `starting`: The deployment has recently started but is not yet usable because the model is not allocated on any nodes.
* `started`: The deployment is usable because at least one node has the model allocated.
* `stopping`: The deployment is preparing to stop and deallocate the model from the relevant nodes.
--

`allocation_status`:::
(object)
The detailed allocation status given the deployment configuration.
+
.Properties of allocation stats
[%collapsible%open]
=====
`allocation_count`:::
(integer)
The current number of nodes where the model is allocated.

`target_allocation_count`:::
(integer)
The desired number of nodes for model allocation.

`state`:::
(string)
The detailed allocation state related to the nodes.
+
--
* `starting`: Allocations are being attempted but no node currently has the model allocated.
* `started`: At least one node has the model allocated.
* `fully_allocated`: The deployment is fully allocated and satisfies the `target_allocation_count`.
--
=====

`nodes`:::
(array of objects)
The deployment stats for each node that currently has the model allocated.
+
.Properties of node stats
[%collapsible%open]
=====
`average_inference_time_ms`:::
(double)
The average time, in milliseconds, for each inference call to complete on this
node.

`inference_count`:::
(integer)
The total number of inference calls made against this node for this model.

`last_access`:::
(long)
The epoch timestamp of the last inference call for the model on this node.

`node`:::
(object)
Information pertaining to the node.
+
.Properties of node
[%collapsible%open]
======
`attributes`:::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-attributes]

`ephemeral_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]

`id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-id]

`name`:::
(string)
The node name.
`transport_address`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-transport-address]
======

`routing_state`:::
(object)
The current routing state for this allocation and the reason for that state.
+
--
* `starting`: The model is attempting to allocate on this node; inference calls are not yet accepted.
* `started`: The model is allocated and ready to accept inference requests.
* `stopping`: The model is being deallocated from this node.
* `stopped`: The model is fully deallocated from this node.
* `failed`: The allocation attempt failed; see the `reason` field for the potential cause.
--

`reason`:::
(string)
The reason for the current state. Usually only populated when the
`routing_state` is `failed`.

`start_time`:::
(long)
The epoch timestamp when the allocation started.
=====
====

[[ml-get-trained-model-deployment-stats-response-codes]]
== {api-response-codes-title}

`404` (Missing resources)::
If `allow_no_match` is `false`, this code indicates that there are no
resources that match the request or only partial matches for the request.
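
The request forms, the `allow_no_match` parameter, and the `404` behavior
described above can be sketched from a client's point of view. The following
Python snippet is only an illustration and is not part of Elasticsearch. The
helper names `build_stats_path` and `describe_status` are hypothetical. The
sketch assumes that model IDs are joined with commas, that wildcards are sent
unescaped, and that at least one ID or pattern is supplied.

```python
from urllib.parse import quote, urlencode


def build_stats_path(model_ids, allow_no_match=None):
    """Build the request path for one or more model IDs or wildcard patterns."""
    if not model_ids:
        # Assumption: this sketch does not cover the form that omits the ID.
        raise ValueError("supply at least one model ID or wildcard expression")
    # Keep ',' and '*' literal so that lists and wildcards survive URL quoting.
    joined = quote(",".join(model_ids), safe=",*")
    path = f"_ml/trained_models/{joined}/deployment/_stats"
    if allow_no_match is not None:
        # Send the Boolean as a lowercase true/false string.
        path += "?" + urlencode({"allow_no_match": str(allow_no_match).lower()})
    return path


def describe_status(status_code, allow_no_match):
    """Interpret a status code along the lines of the response codes section."""
    if status_code == 404 and allow_no_match is False:
        return "no match or only partial matches for the requested IDs"
    if status_code == 200:
        return "ok"
    return f"unexpected status {status_code}"
```

For example, `build_stats_path(["model-a", "distil*"], allow_no_match=False)`
produces a path that combines an exact ID with a wildcard pattern and asks the
API to report a `404` when nothing matches.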
[[ml-get-trained-model-deployment-stats-example]]
== {api-examples-title}

The following example gets deployment information for all currently started
model deployments:

[source,console]
--------------------------------------------------
GET _ml/trained_models/*/deployment/_stats
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
  "count": 2,
  "deployment_stats": [
    {
      "model_id": "elastic__distilbert-base-uncased-finetuned-conll03-english",
      "model_size": "253.3mb",
      "state": "started",
      "allocation_status": {
        "allocation_count": 1,
        "target_allocation_count": 1,
        "state": "fully_allocated"
      },
      "nodes": [
        {
          "node": {
            "6pzZQ9OmQUWAaswMlwVEwg": {
              "name": "runTask-0",
              "ephemeral_id": "aI1OwkPMRCiAJ_1XkEAqdw",
              "transport_address": "127.0.0.1:9300",
              "attributes": {
                "ml.machine_memory": "68719476736",
                "xpack.installed": "true",
                "testattr": "test",
                "ml.max_open_jobs": "512",
                "ml.max_jvm_size": "4181590016"
              },
              "roles": [
                "data",
                "data_cold",
                "data_content",
                "data_frozen",
                "data_hot",
                "data_warm",
                "ingest",
                "master",
                "ml",
                "remote_cluster_client",
                "transform"
              ]
            }
          },
          "routing_state": {
            "routing_state": "started"
          },
          "inference_count": 9,
          "average_inference_time_ms": 51,
          "last_access": 1632855681069
        }
      ]
    },
    {
      "model_id": "typeform__distilbert-base-uncased-mnli",
      "model_size": "255.5mb",
      "state": "started",
      "allocation_status": {
        "allocation_count": 1,
        "target_allocation_count": 1,
        "state": "fully_allocated"
      },
      "nodes": [
        {
          "node": {
            "6pzZQ9OmQUWAaswMlwVEwg": {
              "name": "runTask-0",
              "ephemeral_id": "aI1OwkPMRCiAJ_1XkEAqdw",
              "transport_address": "127.0.0.1:9300",
              "attributes": {
                "ml.machine_memory": "68719476736",
                "xpack.installed": "true",
                "testattr": "test",
                "ml.max_open_jobs": "512",
                "ml.max_jvm_size": "4181590016"
              },
              "roles": [
                "data",
                "data_cold",
                "data_content",
                "data_frozen",
                "data_hot",
                "data_warm",
                "ingest",
                "master",
                "ml",
                "remote_cluster_client",
                "transform"
              ]
            }
          },
          "routing_state": {
"routing_state": "started" }, "inference_count": 0, "average_inference_time_ms": 0 } ] } ] } ---- // NOTCONSOLE