[role="xpack"]
[testenv="basic"]
[[get-trained-model-deployment-stats]]
= Get trained model deployment statistics API
[subs="attributes"]
++++
<titleabbrev>Get trained model deployment stats</titleabbrev>
++++

Retrieves usage information for trained model deployments.

[[ml-get-trained-model-deployment-stats-request]]
== {api-request-title}

`GET _ml/trained_models/<model_id>/deployment/_stats` +

`GET _ml/trained_models/<model_id>,<model_id_2>/deployment/_stats` +

`GET _ml/trained_models/<model_id_pattern*>,<model_id_2>/deployment/_stats`

[[ml-get-trained-model-deployment-stats-prereq]]
== {api-prereq-title}

Requires the `monitor_ml` cluster privilege. This privilege is included in the
`machine_learning_user` built-in role.

[[ml-get-trained-model-deployment-stats-desc]]
== {api-description-title}

You can get deployment information for multiple trained models in a single API
request by using a comma-separated list of model IDs or a wildcard expression.
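
As a minimal sketch of how the request forms above could be assembled, the
following hypothetical helper joins model IDs and wildcard expressions into a
single request path. The name `deployment_stats_path` is an assumption made for
illustration and is not part of any Elasticsearch client. The sketch also
assumes that at least one ID is supplied and that `*` should stay unencoded.

```python
from urllib.parse import quote


def deployment_stats_path(model_ids, allow_no_match=None):
    """Build the request path for the deployment stats API.

    model_ids: iterable of model IDs or wildcard expressions.
    allow_no_match: optional bool; left out of the query string when None.
    """
    ids = [m.strip() for m in model_ids if m and m.strip()]
    if not ids:
        # Assumption of this sketch: callers always name at least one ID.
        raise ValueError("at least one model ID or wildcard expression is required")
    # Keep '*' literal so wildcard expressions survive URL encoding.
    joined = ",".join(quote(m, safe="*") for m in ids)
    path = f"_ml/trained_models/{joined}/deployment/_stats"
    if allow_no_match is not None:
        path += f"?allow_no_match={'true' if allow_no_match else 'false'}"
    return path
```

For instance, `deployment_stats_path(["distil*", "model-b"])` yields a path in
the third request form listed under the request title.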

[[ml-get-trained-model-deployment-stats-path-params]]
== {api-path-parms-title}

`<model_id>`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

[[ml-get-trained-model-deployment-stats-query-params]]
== {api-query-parms-title}

`allow_no_match`::
(Optional, Boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=allow-no-match-models]

[role="child_attributes"]
[[ml-get-trained-model-deployment-stats-results]]
== {api-response-body-title}

`count`::
(integer)
The total number of deployment statistics that matched the requested ID
patterns.

`deployment_stats`::
(array)
An array of trained model deployment statistics, sorted by the `model_id` value
in ascending order.
+
.Properties of trained model deployment stats
[%collapsible%open]
====
`model_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

`model_size`:::
(<<byte-units,byte value>>)
The size of the loaded model in bytes.

`state`:::
(string)
The overall state of the deployment. Possible values:
+
--
* `starting`: The deployment has recently started but is not yet usable because the model is not allocated on any node.
* `started`: The deployment is usable because at least one node has the model allocated.
* `stopping`: The deployment is preparing to stop and deallocate the model from the relevant nodes.
--

`allocation_status`:::
(object)
The detailed allocation status given the deployment configuration.
+
.Properties of allocation stats
[%collapsible%open]
=====
`allocation_count`:::
(integer)
The current number of nodes where the model is allocated.

`target_allocation_count`:::
(integer)
The desired number of nodes for model allocation.

`state`:::
(string)
The detailed allocation state related to the nodes.
+
--
* `starting`: Allocations are being attempted but no node currently has the model allocated.
* `started`: At least one node has the model allocated.
* `fully_allocated`: The deployment is fully allocated and satisfies the `target_allocation_count`.
--
=====

`nodes`:::
(array of objects)
The deployment stats for each node that currently has the model allocated.
+
.Properties of node stats
[%collapsible%open]
=====
`average_inference_time_ms`:::
(double)
The average time, in milliseconds, for each inference call to complete on this node.

`inference_count`:::
(integer)
The total number of inference calls made against this node for this model.

`last_access`:::
(long)
The epoch timestamp of the last inference call for the model on this node.

`node`:::
(object)
Information pertaining to the node.
+
.Properties of node
[%collapsible%open]
======
`attributes`:::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-attributes]

`ephemeral_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]

`id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-id]

`name`:::
(string)
The node name.

`transport_address`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-transport-address]
======

`routing_state`:::
(object)
The current routing state of this allocation and the reason for that state.
+
--
* `starting`: The model is attempting to allocate on this node; inference calls are not yet accepted.
* `started`: The model is allocated and ready to accept inference requests.
* `stopping`: The model is being deallocated from this node.
* `stopped`: The model is fully deallocated from this node.
* `failed`: The allocation attempt failed; see the `reason` field for the potential cause.
--

`reason`:::
(string)
The reason for the current state. Usually populated only when the `routing_state` is `failed`.
=====
====
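
The fields above can be combined into a quick health check. The following is a
minimal sketch, not an official client helper: `summarize_deployment` is a
hypothetical name, and the sketch assumes that `reason` may appear either inside
the `routing_state` object or next to it, since both layouts are plausible from
the field list.

```python
def summarize_deployment(stats):
    """Summarize one entry of `deployment_stats` using the documented fields."""
    allocation = stats.get("allocation_status", {})
    current = allocation.get("allocation_count", 0)
    target = allocation.get("target_allocation_count", 0)

    failures = []
    for node_stats in stats.get("nodes", []):
        routing = node_stats.get("routing_state", {})
        if routing.get("routing_state") == "failed":
            # Assumption: look for `reason` in both possible locations.
            failures.append(
                routing.get("reason") or node_stats.get("reason") or "unknown"
            )

    return {
        "model_id": stats.get("model_id"),
        # `started` means at least one node has the model allocated.
        "usable": stats.get("state") == "started",
        "fully_allocated": allocation.get("state") == "fully_allocated",
        "missing_allocations": max(target - current, 0),
        "failures": failures,
    }
```

A deployment can be `usable` without being `fully_allocated`, so the two flags
are reported separately rather than merged into one.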

[[ml-get-trained-model-deployment-stats-response-codes]]
== {api-response-codes-title}

`404` (Missing resources)::
If `allow_no_match` is `false`, this code indicates that there are no
resources that match the request or only partial matches for the request.

[[ml-get-trained-model-deployment-stats-example]]
== {api-examples-title}

The following example gets deployment information for all currently started model deployments:

[source,console]
--------------------------------------------------
GET _ml/trained_models/*/deployment/_stats
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
  "count": 2,
  "deployment_stats": [
    {
      "model_id": "elastic__distilbert-base-uncased-finetuned-conll03-english",
      "model_size": "253.3mb",
      "state": "started",
      "allocation_status": {
        "allocation_count": 1,
        "target_allocation_count": 1,
        "state": "fully_allocated"
      },
      "nodes": [
        {
          "node": {
            "6pzZQ9OmQUWAaswMlwVEwg": {
              "name": "runTask-0",
              "ephemeral_id": "aI1OwkPMRCiAJ_1XkEAqdw",
              "transport_address": "127.0.0.1:9300",
              "attributes": {
                "ml.machine_memory": "68719476736",
                "xpack.installed": "true",
                "testattr": "test",
                "ml.max_open_jobs": "512",
                "ml.max_jvm_size": "4181590016"
              },
              "roles": [
                "data",
                "data_cold",
                "data_content",
                "data_frozen",
                "data_hot",
                "data_warm",
                "ingest",
                "master",
                "ml",
                "remote_cluster_client",
                "transform"
              ]
            }
          },
          "routing_state": {
            "routing_state": "started"
          },
          "inference_count": 9,
          "average_inference_time_ms": 51,
          "last_access": 1632855681069
        }
      ]
    },
    {
      "model_id": "typeform__distilbert-base-uncased-mnli",
      "model_size": "255.5mb",
      "state": "started",
      "allocation_status": {
        "allocation_count": 1,
        "target_allocation_count": 1,
        "state": "fully_allocated"
      },
      "nodes": [
        {
          "node": {
            "6pzZQ9OmQUWAaswMlwVEwg": {
              "name": "runTask-0",
              "ephemeral_id": "aI1OwkPMRCiAJ_1XkEAqdw",
              "transport_address": "127.0.0.1:9300",
              "attributes": {
                "ml.machine_memory": "68719476736",
                "xpack.installed": "true",
                "testattr": "test",
                "ml.max_open_jobs": "512",
                "ml.max_jvm_size": "4181590016"
              },
              "roles": [
                "data",
                "data_cold",
                "data_content",
                "data_frozen",
                "data_hot",
                "data_warm",
                "ingest",
                "master",
                "ml",
                "remote_cluster_client",
                "transform"
              ]
            }
          },
          "routing_state": {
            "routing_state": "started"
          },
          "inference_count": 0,
          "average_inference_time_ms": 0
        }
      ]
    }
  ]
}
----
// NOTCONSOLE
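
A response shaped like the one above could be post-processed as follows. This
is a sketch under stated assumptions: `inference_totals` is a hypothetical
helper, `last_access` is treated as epoch milliseconds (which matches the
magnitude of the sample value but is not stated explicitly), and nodes that have
served no inference calls are assumed to omit `last_access`, as the second
deployment in the example does.

```python
from datetime import datetime, timezone


def inference_totals(response):
    """Aggregate per-node stats into one record per deployed model."""
    totals = {}
    for deployment in response.get("deployment_stats", []):
        nodes = deployment.get("nodes", [])
        count = sum(n.get("inference_count", 0) for n in nodes)
        # Nodes without any inference calls may not report `last_access`.
        accesses = [n["last_access"] for n in nodes if "last_access" in n]
        latest = None
        if accesses:
            # Assumption: the value is in epoch milliseconds.
            latest = datetime.fromtimestamp(max(accesses) / 1000, tz=timezone.utc)
        totals[deployment["model_id"]] = {
            "inference_count": count,
            "last_access": latest,
        }
    return totals
```

Summing across `nodes` matters once a deployment is allocated to more than one
node, because `inference_count` is reported per node rather than per model.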