  1. [role="xpack"]
  2. [[get-ml-memory]]
  3. = Get machine learning memory stats API
  4. [subs="attributes"]
  5. ++++
  6. <titleabbrev>Get {ml} memory stats</titleabbrev>
  7. ++++
  8. Returns information on how {ml} is using memory.
  9. [[get-ml-memory-request]]
  10. == {api-request-title}
  11. `GET _ml/memory/_stats` +
  12. `GET _ml/memory/<node_id>/_stats`
  13. [[get-ml-memory-prereqs]]
  14. == {api-prereq-title}
  15. Requires the `monitor_ml` cluster privilege. This privilege is included in the
  16. `machine_learning_user` built-in role.
  17. [[get-ml-memory-desc]]
  18. == {api-description-title}
  19. Gets information about how {ml} jobs and trained models are using memory on each
  20. node, both within the JVM heap and natively, outside of the JVM.
  21. [[get-ml-memory-path-params]]
  22. == {api-path-parms-title}
  23. `<node_id>`::
  24. (Optional, string) The names of particular nodes in the cluster to target.
  25. For example, `nodeId1,nodeId2` or `ml:true`. For node selection options,
  26. see <<cluster-nodes>>.
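
The two request forms differ only in the optional node filter segment. As a minimal sketch, the helper below builds the request path for either form. The function name `ml_memory_stats_path` is hypothetical and not part of any client library, and the choice to leave commas, colons, and wildcards unescaped in the filter is an assumption.

```python
from urllib.parse import quote


def ml_memory_stats_path(node_filter=None):
    """Build the request path for the get ML memory stats API.

    node_filter is an optional node selection string such as
    "nodeId1,nodeId2" or "ml:true". The helper name and the set of
    characters left unescaped are illustrative assumptions.
    """
    if not node_filter:
        # No filter: use the form without a node segment.
        return "/_ml/memory/_stats"
    # Keep ',', ':' and '*' readable; percent-encode anything else unsafe.
    return f"/_ml/memory/{quote(node_filter, safe=',:*')}/_stats"
```

For example, `ml_memory_stats_path("ml:true")` yields `/_ml/memory/ml:true/_stats`, which restricts the request to nodes matching that filter.
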
  27. [[get-ml-memory-query-parms]]
  28. == {api-query-parms-title}
  29. `human`::
  30. (Optional, Boolean) Specify this query parameter to include the fields with units in the
  31. response. Otherwise, only the `_in_bytes` sizes are returned.
  32. include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=timeoutparms]
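
To illustrate what the `human` parameter changes, the sketch below takes a response that was requested with `human` and derives the view that contains only the `_in_bytes` sizes. It assumes that every field with units has a sibling whose name is the same key plus the `_in_bytes` suffix, which holds for the fields documented on this page. The function name `without_human_fields` is hypothetical.

```python
def without_human_fields(obj):
    """Return a copy of obj without the fields that carry units.

    A key X is dropped whenever a sibling key X_in_bytes exists in the
    same object, so "total" is removed next to "total_in_bytes" while
    "_nodes.total" (which has no such sibling) is kept.
    """
    if isinstance(obj, dict):
        return {
            key: without_human_fields(value)
            for key, value in obj.items()
            if f"{key}_in_bytes" not in obj
        }
    if isinstance(obj, list):
        return [without_human_fields(item) for item in obj]
    return obj
```
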
  33. [role="child_attributes"]
  34. [[get-ml-memory-response-body]]
  35. == {api-response-body-title}
  36. `_nodes`::
  37. (object)
  38. Contains statistics about the number of nodes selected by the request.
  39. +
  40. .Properties of `_nodes`
  41. [%collapsible%open]
  42. ====
  43. `failed`::
  44. (integer)
  45. Number of nodes that rejected the request or failed to respond. If this value
  46. is not `0`, a reason for the rejection or failure is included in the response.
  47. `successful`::
  48. (integer)
  49. Number of nodes that responded successfully to the request.
  50. `total`::
  51. (integer)
  52. Total number of nodes selected by the request.
  53. ====
  54. `cluster_name`::
  55. (string)
  56. Name of the cluster. Based on the <<cluster-name,cluster.name>> setting.
  57. `nodes`::
  58. (object)
  59. Contains statistics for the nodes selected by the request.
  60. +
  61. .Properties of `nodes`
  62. [%collapsible%open]
  63. ====
  64. `<node_id>`::
  65. (object)
  66. Contains statistics for the node.
  67. +
  68. .Properties of `<node_id>`
  69. [%collapsible%open]
  70. =====
  71. `attributes`::
  72. (object)
  73. include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-attributes]
  74. `ephemeral_id`::
  75. (string)
  76. include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]
  77. `jvm`::
  78. (object)
  79. Contains Java Virtual Machine (JVM) statistics for the node.
  80. +
  81. .Properties of `jvm`
  82. [%collapsible%open]
  83. ======
  84. `heap_max`::
  85. (<<byte-units,byte value>>)
  86. Maximum amount of memory available for use by the heap.
  87. `heap_max_in_bytes`::
  88. (integer)
  89. Maximum amount of memory, in bytes, available for use by the heap.
  90. `java_inference`::
  91. (<<byte-units,byte value>>)
  92. Amount of Java heap currently being used for caching inference models.
  93. `java_inference_in_bytes`::
  94. (integer)
  95. Amount of Java heap, in bytes, currently being used for caching inference models.
  96. `java_inference_max`::
  97. (<<byte-units,byte value>>)
  98. Maximum amount of Java heap to be used for caching inference models.
  99. `java_inference_max_in_bytes`::
  100. (integer)
  101. Maximum amount of Java heap, in bytes, to be used for caching inference models.
  102. ======
  103. `mem`::
  104. (object)
  105. Contains statistics about memory usage for the node.
  106. +
  107. .Properties of `mem`
  108. [%collapsible%open]
  109. ======
  110. `adjusted_total`::
  111. (<<byte-units,byte value>>)
  112. If the amount of physical memory has been overridden using the `es.total_memory_bytes`
  113. system property, then this reports the overridden value. Otherwise, it reports the same
  114. value as `total`.
  115. `adjusted_total_in_bytes`::
  116. (integer)
  117. If the amount of physical memory has been overridden using the `es.total_memory_bytes`
  118. system property, then this reports the overridden value in bytes. Otherwise, it reports
  119. the same value as `total_in_bytes`.
  120. `ml`::
  121. (object)
  122. Contains statistics about {ml} use of native memory on the node.
  123. +
  124. .Properties of `ml`
  125. [%collapsible%open]
  126. =======
  127. `anomaly_detectors`::
  128. (<<byte-units,byte value>>)
  129. Amount of native memory set aside for {anomaly-jobs}.
  130. `anomaly_detectors_in_bytes`::
  131. (integer)
  132. Amount of native memory, in bytes, set aside for {anomaly-jobs}.
  133. `data_frame_analytics`::
  134. (<<byte-units,byte value>>)
  135. Amount of native memory set aside for {dfanalytics-jobs}.
  136. `data_frame_analytics_in_bytes`::
  137. (integer)
  138. Amount of native memory, in bytes, set aside for {dfanalytics-jobs}.
  139. `max`::
  140. (<<byte-units,byte value>>)
  141. Maximum amount of native memory (separate from the JVM heap) that may be used by {ml}
  142. native processes.
  143. `max_in_bytes`::
  144. (integer)
  145. Maximum amount of native memory (separate from the JVM heap), in bytes, that may be
  146. used by {ml} native processes.
  147. `native_code_overhead`::
  148. (<<byte-units,byte value>>)
  149. Amount of native memory set aside for loading {ml} native code shared libraries.
  150. `native_code_overhead_in_bytes`::
  151. (integer)
  152. Amount of native memory, in bytes, set aside for loading {ml} native code shared libraries.
  153. `native_inference`::
  154. (<<byte-units,byte value>>)
  155. Amount of native memory set aside for trained models that have a PyTorch `model_type`.
  156. `native_inference_in_bytes`::
  157. (integer)
  158. Amount of native memory, in bytes, set aside for trained models that have a PyTorch `model_type`.
  159. =======
  160. `total`::
  161. (<<byte-units,byte value>>)
  162. Total amount of physical memory.
  163. `total_in_bytes`::
  164. (integer)
  165. Total amount of physical memory, in bytes.
  166. ======
  167. `name`::
  168. (string)
  169. Human-readable identifier for the node. Based on the <<node-name>> setting.
  170. `roles`::
  171. (array of strings)
  172. Roles assigned to the node. See <<modules-node>>.
  173. `transport_address`::
  174. (string)
  175. include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-transport-address]
  176. =====
  177. ====
  178. [[get-ml-memory-example]]
  179. == {api-examples-title}
  180. [source,console]
  181. --------------------------------------------------
  182. GET _ml/memory/_stats?human
  183. --------------------------------------------------
  184. // TEST[setup:node]
  185. This is a possible response:
  186. [source,console-result]
  187. ----
  188. {
  189.   "_nodes": {
  190.     "total": 1,
  191.     "successful": 1,
  192.     "failed": 0
  193.   },
  194.   "cluster_name": "my_cluster",
  195.   "nodes": {
  196.     "pQHNt5rXTTWNvUgOrdynKg": {
  197.       "name": "node-0",
  198.       "ephemeral_id": "ITZ6WGZnSqqeT_unfit2SQ",
  199.       "transport_address": "127.0.0.1:9300",
  200.       "attributes": {
  201.         "ml.machine_memory": "68719476736",
  202.         "ml.max_jvm_size": "536870912"
  203.       },
  204.       "roles": [
  205.         "data",
  206.         "data_cold",
  207.         "data_content",
  208.         "data_frozen",
  209.         "data_hot",
  210.         "data_warm",
  211.         "ingest",
  212.         "master",
  213.         "ml",
  214.         "remote_cluster_client",
  215.         "transform"
  216.       ],
  217.       "mem": {
  218.         "total": "64gb",
  219.         "total_in_bytes": 68719476736,
  220.         "adjusted_total": "64gb",
  221.         "adjusted_total_in_bytes": 68719476736,
  222.         "ml": {
  223.           "max": "19.1gb",
  224.           "max_in_bytes": 20615843020,
  225.           "native_code_overhead": "0b",
  226.           "native_code_overhead_in_bytes": 0,
  227.           "anomaly_detectors": "0b",
  228.           "anomaly_detectors_in_bytes": 0,
  229.           "data_frame_analytics": "0b",
  230.           "data_frame_analytics_in_bytes": 0,
  231.           "native_inference": "0b",
  232.           "native_inference_in_bytes": 0
  233.         }
  234.       },
  235.       "jvm": {
  236.         "heap_max": "512mb",
  237.         "heap_max_in_bytes": 536870912,
  238.         "java_inference_max": "204.7mb",
  239.         "java_inference_max_in_bytes": 214748364,
  240.         "java_inference": "0b",
  241.         "java_inference_in_bytes": 0
  242.       }
  243.     }
  244.   }
  245. }
  246. ----
  247. // TESTRESPONSE[s/"cluster_name": "my_cluster"/"cluster_name": $body.cluster_name/]
  248. // TESTRESPONSE[s/"pQHNt5rXTTWNvUgOrdynKg"/\$node_name/]
  249. // TESTRESPONSE[s/"ephemeral_id": "ITZ6WGZnSqqeT_unfit2SQ"/"ephemeral_id": "$body.$_path"/]
  250. // TESTRESPONSE[s/"transport_address": "127.0.0.1:9300"/"transport_address": "$body.$_path"/]
  251. // TESTRESPONSE[s/"attributes": \{[^\}]*\}/"attributes": $body.$_path/]
  252. // TESTRESPONSE[s/"total": "64gb"/"total": "$body.$_path"/]
  253. // TESTRESPONSE[s/"total_in_bytes": 68719476736/"total_in_bytes": $body.$_path/]
  254. // TESTRESPONSE[s/"adjusted_total": "64gb"/"adjusted_total": "$body.$_path"/]
  255. // TESTRESPONSE[s/"adjusted_total_in_bytes": 68719476736/"adjusted_total_in_bytes": $body.$_path/]
  256. // TESTRESPONSE[s/"max": "19.1gb"/"max": "$body.$_path"/]
  257. // TESTRESPONSE[s/"max_in_bytes": 20615843020/"max_in_bytes": $body.$_path/]
  258. // TESTRESPONSE[s/"heap_max": "512mb"/"heap_max": "$body.$_path"/]
  259. // TESTRESPONSE[s/"heap_max_in_bytes": 536870912/"heap_max_in_bytes": $body.$_path/]
  260. // TESTRESPONSE[s/"java_inference_max": "204.7mb"/"java_inference_max": "$body.$_path"/]
  261. // TESTRESPONSE[s/"java_inference_max_in_bytes": 214748364/"java_inference_max_in_bytes": $body.$_path/]
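
A response like the one above can be reduced to a per-node summary of {ml} native memory. The following is a minimal sketch rather than an official calculation: it assumes that the four `*_in_bytes` amounts under `mem.ml` can be added together and compared against `max_in_bytes` to estimate the remaining headroom. The function name `ml_native_memory_summary` and the output keys `reserved_in_bytes` and `headroom_in_bytes` are hypothetical.

```python
def ml_native_memory_summary(stats):
    """Summarize ML native memory per node from a parsed stats response.

    Assumption: native code overhead, anomaly detectors, data frame
    analytics, and native inference together make up the native memory
    currently set aside, so headroom is max_in_bytes minus their sum.
    """
    summary = {}
    for node_id, node in stats.get("nodes", {}).items():
        ml = node["mem"]["ml"]
        reserved = (
            ml["native_code_overhead_in_bytes"]
            + ml["anomaly_detectors_in_bytes"]
            + ml["data_frame_analytics_in_bytes"]
            + ml["native_inference_in_bytes"]
        )
        # Key the result by node name, falling back to the node ID.
        summary[node.get("name", node_id)] = {
            "max_in_bytes": ml["max_in_bytes"],
            "reserved_in_bytes": reserved,
            "headroom_in_bytes": ml["max_in_bytes"] - reserved,
        }
    return summary
```

Only the `_in_bytes` fields are read, so the sketch works whether or not the request included the `human` parameter.
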