[role="xpack"]
[[get-trained-models-stats]]
= Get trained models statistics API
[subs="attributes"]
++++
<titleabbrev>Get trained models stats</titleabbrev>
++++

Retrieves usage information for trained models.

[[ml-get-trained-models-stats-request]]
== {api-request-title}

`GET _ml/trained_models/_stats` +
`GET _ml/trained_models/_all/_stats` +
`GET _ml/trained_models/<model_id_or_deployment_id>/_stats` +
`GET _ml/trained_models/<model_id_or_deployment_id>,<model_id_2_or_deployment_id_2>/_stats` +
`GET _ml/trained_models/<model_id_pattern*_or_deployment_id_pattern*>,<model_id_2_or_deployment_id_2>/_stats`

[[ml-get-trained-models-stats-prereq]]
== {api-prereq-title}

Requires the `monitor_ml` cluster privilege. This privilege is included in the
`machine_learning_user` built-in role.

[[ml-get-trained-models-stats-desc]]
== {api-description-title}

You can get usage information for multiple trained models or trained model
deployments in a single API request by using a comma-separated list of model
IDs, deployment IDs, or a wildcard expression.
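
For example, the following request combines a wildcard expression with an
explicit ID to get usage information for several models at once. The
`my-model-*` pattern and the `other-model` ID are hypothetical placeholders;
substitute your own model or deployment IDs:

[source,console]
--------------------------------------------------
GET _ml/trained_models/my-model-*,other-model/_stats
--------------------------------------------------
// TEST[skip:TBD]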

[[ml-get-trained-models-stats-path-params]]
== {api-path-parms-title}

`<model_id_or_deployment_id>`::
(Optional, string)
The unique identifier of the model or the deployment. If a model has multiple
deployments and the ID of one of those deployments matches the model ID, the
model ID takes precedence and results are returned for all deployments of the
model.
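
For example, suppose a model whose ID is `my-model` (a placeholder name) has
two deployments with the IDs `my-model` and `my-model-for-search`. Because the
following request matches the model ID, it returns statistics for both
deployments rather than only the `my-model` deployment:

[source,console]
--------------------------------------------------
GET _ml/trained_models/my-model/_stats
--------------------------------------------------
// TEST[skip:TBD]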

[[ml-get-trained-models-stats-query-params]]
== {api-query-parms-title}

`allow_no_match`::
(Optional, Boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=allow-no-match-models]

`from`::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=from-models]

`size`::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=size-models]
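
You can combine the `from` and `size` parameters to page through matching
models. For example, the following request skips the first two matching models
and returns the next two (the values here are illustrative):

[source,console]
--------------------------------------------------
GET _ml/trained_models/_all/_stats?from=2&size=2
--------------------------------------------------
// TEST[skip:TBD]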

[role="child_attributes"]
[[ml-get-trained-models-stats-results]]
== {api-response-body-title}

`count`::
(integer)
The total number of trained model statistics that matched the requested ID
patterns. This value can be higher than the number of items in the
`trained_model_stats` array, as the size of the array is limited by the
supplied `size` parameter.

`trained_model_stats`::
(array)
An array of trained model statistics, which are sorted by the `model_id` value
in ascending order.
+
.Properties of trained model stats
[%collapsible%open]
====
`deployment_stats`:::
(list)
A collection of deployment stats if one of the provided `model_id` values
is deployed.
+
.Properties of deployment stats
[%collapsible%open]
=====
`allocation_status`:::
(object)
The detailed allocation status given the deployment configuration.
+
.Properties of allocation stats
[%collapsible%open]
======
`allocation_count`:::
(integer)
The current number of nodes where the model is allocated.

`state`:::
(string)
The detailed allocation state related to the nodes.
+
--
* `starting`: Allocations are being attempted but no node currently has the model allocated.
* `started`: At least one node has the model allocated.
* `fully_allocated`: The deployment is fully allocated and satisfies the `target_allocation_count`.
--

`target_allocation_count`:::
(integer)
The desired number of nodes for model allocation.
======

`cache_size`:::
(<<byte-units,byte value>>)
The inference cache size (in memory outside the JVM heap) per node for the model.

`deployment_id`:::
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=deployment-id]

`error_count`:::
(integer)
The sum of `error_count` for all nodes in the deployment.

`inference_count`:::
(integer)
The sum of `inference_count` for all nodes in the deployment.

`model_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

`nodes`:::
(array of objects)
The deployment stats for each node that currently has the model allocated.
+
.Properties of node stats
[%collapsible%open]
======
`average_inference_time_ms`:::
(double)
The average time for each inference call to complete on this node.
The average is calculated over the lifetime of the deployment.

`average_inference_time_ms_excluding_cache_hits`:::
(double)
The average time to perform inference on the trained model, excluding
occasions where the response comes from the cache. Cached inference calls
return very quickly because the model is not evaluated; by excluding cache
hits, this value provides an accurate measure of the average time taken to
evaluate the model.

`average_inference_time_ms_last_minute`:::
(double)
The average time for each inference call to complete on this node
in the last minute.

`error_count`:::
(integer)
The number of errors when evaluating the trained model.

`inference_cache_hit_count`:::
(integer)
The total number of inference calls made against this node for this
model that were served from the inference cache.

`inference_cache_hit_count_last_minute`:::
(integer)
The number of inference calls made against this node for this model
in the last minute that were served from the inference cache.

`inference_count`:::
(integer)
The total number of inference calls made against this node for this model.

`last_access`:::
(long)
The epoch timestamp of the last inference call for the model on this node.

`node`:::
(object)
Information pertaining to the node.
+
.Properties of node
[%collapsible%open]
========
`attributes`:::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-attributes]

`ephemeral_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]

`id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-id]

`name`:::
(string)
The node name.

`transport_address`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-transport-address]
========

`number_of_allocations`:::
(integer)
The number of allocations assigned to this node.

`number_of_pending_requests`:::
(integer)
The number of inference requests queued to be processed.

`peak_throughput_per_minute`:::
(integer)
The peak number of requests processed in a 1 minute period.

`routing_state`:::
(object)
The current routing state and the reason for the current routing state for this allocation.
+
.Properties of routing_state
[%collapsible%open]
========
`reason`:::
(string)
The reason for the current state. Usually only populated when the
`routing_state` is `failed`.

`routing_state`:::
(string)
The current routing state.
+
--
* `starting`: The model is attempting to allocate on this node; inference calls are not yet accepted.
* `started`: The model is allocated and ready to accept inference requests.
* `stopping`: The model is being deallocated from this node.
* `stopped`: The model is fully deallocated from this node.
* `failed`: The allocation attempt failed; see the `reason` field for the potential cause.
--
========

`rejected_execution_count`:::
(integer)
The number of inference requests that were not processed because the
queue was full.

`start_time`:::
(long)
The epoch timestamp when the allocation started.

`threads_per_allocation`:::
(integer)
The number of threads used by each allocation during inference.
This value is limited by the number of hardware threads on the node;
it might therefore differ from the `threads_per_allocation` value in the
<<start-trained-model-deployment>> API.

`timeout_count`:::
(integer)
The number of inference requests that timed out before being processed.

`throughput_last_minute`:::
(integer)
The number of requests processed in the last 1 minute.
======

`number_of_allocations`:::
(integer)
The requested number of allocations for the trained model deployment.

`peak_throughput_per_minute`:::
(integer)
The peak number of requests processed in a 1 minute period for
all nodes in the deployment. This is calculated as the sum of
each node's `peak_throughput_per_minute` value.

`priority`:::
(string)
The deployment priority.

`queue_capacity`:::
(integer)
The number of inference requests that may be queued before new requests are
rejected.

`rejected_execution_count`:::
(integer)
The sum of `rejected_execution_count` for all nodes in the deployment.
Individual nodes reject an inference request if the inference queue is full.
The queue size is controlled by the `queue_capacity` setting in the
<<start-trained-model-deployment>> API.

`reason`:::
(string)
The reason for the current deployment state.
Usually only populated when the model is not deployed to a node.

`start_time`:::
(long)
The epoch timestamp when the deployment started.

`state`:::
(string)
The overall state of the deployment. The values may be:
+
--
* `starting`: The deployment has recently started but is not yet usable because the model is not allocated on any nodes.
* `started`: The deployment is usable, as at least one node has the model allocated.
* `stopping`: The deployment is preparing to stop and deallocate the model from the relevant nodes.
--

`threads_per_allocation`:::
(integer)
The number of threads per allocation used by the inference process.

`timeout_count`:::
(integer)
The sum of `timeout_count` for all nodes in the deployment.
=====

`inference_stats`:::
(object)
A collection of inference stats fields.
+
.Properties of inference stats
[%collapsible%open]
=====
`missing_all_fields_count`:::
(integer)
The number of inference calls where all the training features for the model
were missing.

`inference_count`:::
(integer)
The total number of times the model has been called for inference.
This is across all inference contexts, including all pipelines.

`cache_miss_count`:::
(integer)
The number of times the model was loaded for inference and was not retrieved
from the cache. If this number is close to the `inference_count`, the cache
is not being used appropriately. You can address this by increasing the cache
size or its time-to-live (TTL). See <<general-ml-settings>> for the
appropriate settings.

`failure_count`:::
(integer)
The number of failures when using the model for inference.

`timestamp`:::
(<<time-units,time units>>)
The time when the statistics were last updated.
=====

`ingest`:::
(object)
A collection of ingest stats for the model across all nodes. The values are
summations of the individual node statistics. The format matches the `ingest`
section in <<cluster-nodes-stats>>.

`model_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

`model_size_stats`:::
(object)
A collection of model size stats fields.
+
.Properties of model size stats
[%collapsible%open]
=====
`model_size_bytes`:::
(integer)
The size of the model in bytes.

`required_native_memory_bytes`:::
(integer)
The amount of memory, in bytes, required to load the model.
=====

`pipeline_count`:::
(integer)
The number of ingest pipelines that currently refer to the model.
====

[[ml-get-trained-models-stats-response-codes]]
== {api-response-codes-title}

`404` (Missing resources)::
If `allow_no_match` is `false`, this code indicates that there are no
resources that match the request or only partial matches for the request.
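
For example, the following request returns a `404` status code if no model IDs
start with `nonexistent-model` (a placeholder prefix), because
`allow_no_match` is set to `false`:

[source,console]
--------------------------------------------------
GET _ml/trained_models/nonexistent-model*/_stats?allow_no_match=false
--------------------------------------------------
// TEST[skip:TBD]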

[[ml-get-trained-models-stats-example]]
== {api-examples-title}

The following example gets usage information for all the trained models:

[source,console]
--------------------------------------------------
GET _ml/trained_models/_stats
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
  "count": 2,
  "trained_model_stats": [
    {
      "model_id": "flight-delay-prediction-1574775339910",
      "pipeline_count": 0,
      "inference_stats": {
        "failure_count": 0,
        "inference_count": 4,
        "cache_miss_count": 3,
        "missing_all_fields_count": 0,
        "timestamp": 1592399986979
      }
    },
    {
      "model_id": "regression-job-one-1574775307356",
      "pipeline_count": 1,
      "inference_stats": {
        "failure_count": 0,
        "inference_count": 178,
        "cache_miss_count": 3,
        "missing_all_fields_count": 0,
        "timestamp": 1592399986979
      },
      "ingest": {
        "total": {
          "count": 178,
          "time_in_millis": 8,
          "current": 0,
          "failed": 0
        },
        "pipelines": {
          "flight-delay": {
            "count": 178,
            "time_in_millis": 8,
            "current": 0,
            "failed": 0,
            "processors": [
              {
                "inference": {
                  "type": "inference",
                  "stats": {
                    "count": 178,
                    "time_in_millis": 7,
                    "current": 0,
                    "failed": 0
                  }
                }
              }
            ]
          }
        }
      }
    }
  ]
}
----
// NOTCONSOLE
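
The deployment statistics described above are returned only for models that
are currently deployed. To inspect them for a single deployment, request its
stats by deployment ID. The `my-deployment` ID below is a hypothetical
placeholder:

[source,console]
--------------------------------------------------
GET _ml/trained_models/my-deployment/_stats
--------------------------------------------------
// TEST[skip:TBD]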