[[cluster-allocation-explain]]
=== Cluster allocation explain API
++++
<titleabbrev>Cluster allocation explain</titleabbrev>
++++

Provides explanations for shard allocations in the cluster.

[[cluster-allocation-explain-api-request]]
==== {api-request-title}

`GET /_cluster/allocation/explain`

[[cluster-allocation-explain-api-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have the `monitor` or
`manage` <<privileges-list-cluster,cluster privilege>> to use this API.

[[cluster-allocation-explain-api-desc]]
==== {api-description-title}

The cluster allocation explain API explains shard allocation decisions in the
cluster. For an unassigned shard, it explains why the shard is unassigned.
For an assigned shard, it explains why the shard remains on its current node
and has not moved or rebalanced to another node. Use this API to diagnose why
a shard is unassigned or why it stays on its current node when you expect
otherwise.

[[cluster-allocation-explain-api-query-params]]
==== {api-query-parms-title}

`include_disk_info`::
    (Optional, Boolean) If `true`, returns information about disk usage and
    shard sizes. Defaults to `false`.

`include_yes_decisions`::
    (Optional, Boolean) If `true`, returns `YES` decisions in the explanation.
    Defaults to `false`.
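
As a rough illustration of how the two query parameters combine, the following Python sketch builds the request URL and appends only the flags that differ from their `false` defaults. The helper name `explain_url` and the `http://localhost:9200` base URL are assumptions made for this example; they are not part of the API.

```python
from urllib.parse import urlencode


def explain_url(base_url, include_disk_info=False, include_yes_decisions=False):
    """Return the allocation explain URL, adding only non-default flags."""
    params = {}
    if include_disk_info:
        params["include_disk_info"] = "true"
    if include_yes_decisions:
        params["include_yes_decisions"] = "true"
    url = base_url.rstrip("/") + "/_cluster/allocation/explain"
    # Append a query string only when at least one flag is set.
    return url + ("?" + urlencode(params) if params else "")


# The base URL is a hypothetical local cluster address.
print(explain_url("http://localhost:9200", include_disk_info=True))
# prints http://localhost:9200/_cluster/allocation/explain?include_disk_info=true
```

Leaving both flags unset yields the bare endpoint, which matches the documented defaults.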

[[cluster-allocation-explain-api-request-body]]
==== {api-request-body-title}

`current_node`::
    (Optional, string) ID or name of a node. If specified, the API only
    explains a shard that is currently located on that node.

`index`::
    (Optional, string) Name of the index that you would like an explanation
    for.

`primary`::
    (Optional, Boolean) If `true`, returns an explanation for the primary
    shard of the given shard ID.

`shard`::
    (Optional, integer) ID of the shard that you would like an explanation
    for.

You can also have {es} explain the allocation of the first unassigned shard that
it finds by sending an empty body for the request.
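
The request body can be assembled programmatically. The sketch below is one possible approach, not a client library API: the hypothetical helper `explain_body` keeps only the fields that were supplied, so calling it with no arguments produces the empty body that asks {es} for the first unassigned shard it finds.

```python
import json


def explain_body(index=None, shard=None, primary=None, current_node=None):
    """Collect only the fields that were given; no fields means an empty body."""
    fields = {
        "index": index,
        "shard": shard,
        "primary": primary,
        "current_node": current_node,
    }
    # Compare against None so that shard 0 and primary=False are kept.
    return {name: value for name, value in fields.items() if value is not None}


print(json.dumps(explain_body("my-index-000001", 0, True)))
# prints {"index": "my-index-000001", "shard": 0, "primary": true}
print(json.dumps(explain_body()))
# prints {}
```

Filtering on `None` rather than on truthiness matters here, because `0` and `False` are both meaningful values for `shard` and `primary`.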

[[cluster-allocation-explain-api-examples]]
==== {api-examples-title}

//////

[source,console]
--------------------------------------------------
PUT /my-index-000001
--------------------------------------------------
// TESTSETUP

//////

[source,console]
--------------------------------------------------
GET /_cluster/allocation/explain
{
  "index": "my-index-000001",
  "shard": 0,
  "primary": true
}
--------------------------------------------------

===== Example of the current_node parameter

[source,console]
--------------------------------------------------
GET /_cluster/allocation/explain
{
  "index": "my-index-000001",
  "shard": 0,
  "primary": false,
  "current_node": "nodeA"                         <1>
}
--------------------------------------------------
// TEST[skip:no way of knowing the current_node]

<1> The node on which shard 0 currently has a replica

===== Examples of unassigned primary shard explanations

//////

[source,console]
--------------------------------------------------
DELETE my-index-000001
--------------------------------------------------

//////

[source,console]
--------------------------------------------------
PUT /my-index-000001?master_timeout=1s&timeout=1s
{
  "settings": {
    "index.routing.allocation.include._name": "non_existent_node",
    "index.routing.allocation.include._tier_preference": null
  }
}

GET /_cluster/allocation/explain
{
  "index": "my-index-000001",
  "shard": 0,
  "primary": true
}
--------------------------------------------------
// TEST[continued]

The API returns the following response for an unassigned primary shard:

[source,console-result]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",                 <1>
  "unassigned_info" : {
    "reason" : "INDEX_CREATED",                   <2>
    "at" : "2017-01-04T18:08:16.600Z",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "no",                          <3>
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "8qt2rY-pT6KNZB3-hGfLnw",
      "node_name" : "node-0",
      "transport_address" : "127.0.0.1:9401",
      "node_attributes" : {},
      "node_decision" : "no",                     <4>
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "filter",                   <5>
          "decision" : "NO",
          "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"  <6>
        }
      ]
    }
  ]
}
--------------------------------------------------
// TESTRESPONSE[s/"at" : "[^"]*"/"at" : $body.$_path/]
// TESTRESPONSE[s/"node_id" : "[^"]*"/"node_id" : $body.$_path/]
// TESTRESPONSE[s/"transport_address" : "[^"]*"/"transport_address" : $body.$_path/]
// TESTRESPONSE[s/"node_attributes" : \{\}/"node_attributes" : $body.$_path/]

<1> The current state of the shard.
<2> The reason for the shard originally becoming unassigned.
<3> Whether the shard can be allocated.
<4> Whether the shard can be allocated to this particular node.
<5> The decider that led to the `no` decision for the node.
<6> An explanation of why the decider returned a `no` decision, with a hint pointing to the setting that led to the decision.
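
When a response lists many nodes, it can help to reduce it to the deciders that returned `NO`. The following Python sketch does this for an already parsed response. The function name `blocking_deciders` is an assumption of this example, and the sample dictionary is a trimmed copy of the response above with a shortened explanation string.

```python
def blocking_deciders(explanation):
    """Map each node name to the (decider, explanation) pairs that returned NO."""
    blocked = {}
    for node in explanation.get("node_allocation_decisions", []):
        reasons = [
            (decider["decider"], decider["explanation"])
            for decider in node.get("deciders", [])
            if decider["decision"] == "NO"
        ]
        if reasons:
            blocked[node["node_name"]] = reasons
    return blocked


# Trimmed sample based on the response above; the explanation text is shortened.
response = {
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "node-0",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "filter",
                    "decision": "NO",
                    "explanation": "node does not match index setting",
                }
            ],
        }
    ],
}

for name, reasons in blocking_deciders(response).items():
    for decider, why in reasons:
        print(f"{name}: {decider}: {why}")
# prints node-0: filter: node does not match index setting
```

Nodes without a `deciders` array, or with only non-`NO` decisions, are left out of the result.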

The API response output for an unassigned primary shard that had previously been
allocated to a node in the cluster:

[source,js]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2017-01-04T18:03:28.464Z",
    "details" : "node_left[OIWe8UhhThCK0V5XfmdrmQ]",
    "last_allocation_status" : "no_valid_shard_copy"
  },
  "can_allocate" : "no_valid_shard_copy",
  "allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster"
}
--------------------------------------------------
// NOTCONSOLE

===== Example of an unassigned replica shard explanation

The API response output for a replica that is unassigned due to delayed
allocation:

[source,js]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2017-01-04T18:53:59.498Z",
    "details" : "node_left[G92ZwuuaRY-9n8_tc-IzEg]",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "allocation_delayed",
  "allocate_explanation" : "cannot allocate because the cluster is still waiting 59.8s for the departed node holding a replica to rejoin, despite being allowed to allocate the shard to at least one other node",
  "configured_delay" : "1m",                      <1>
  "configured_delay_in_millis" : 60000,
  "remaining_delay" : "59.8s",                    <2>
  "remaining_delay_in_millis" : 59824,
  "node_allocation_decisions" : [
    {
      "node_id" : "pmnHu_ooQWCPEFobZGbpWw",
      "node_name" : "node_t2",
      "transport_address" : "127.0.0.1:9402",
      "node_decision" : "yes"
    },
    {
      "node_id" : "3sULLVJrRneSg0EfBB-2Ew",
      "node_name" : "node_t0",
      "transport_address" : "127.0.0.1:9400",
      "node_decision" : "no",
      "store" : {                                 <3>
        "matching_size" : "4.2kb",
        "matching_size_in_bytes" : 4325
      },
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "a copy of this shard is already allocated to this node [[my-index-000001][0], node[3sULLVJrRneSg0EfBB-2Ew], [P], s[STARTED], a[id=eV9P8BN1QPqRc3B4PLx6cg]]"
        }
      ]
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE

<1> The configured delay before allocating a replica shard that is missing because the node holding it left the cluster.
<2> The remaining delay before allocating the replica shard.
<3> Information about the shard data found on a node.
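
For monitoring, the delayed-allocation fields above can be read directly from the parsed response. The sketch below is illustrative only: `delayed_allocation_summary` is a hypothetical helper, and the sample dictionary keeps just the fields it reads from the response shown above.

```python
def delayed_allocation_summary(explanation):
    """Return (remaining seconds, nodes with a "yes" decision), or None if not delayed."""
    if explanation.get("can_allocate") != "allocation_delayed":
        return None
    remaining_seconds = explanation["remaining_delay_in_millis"] / 1000
    candidates = [
        node["node_name"]
        for node in explanation.get("node_allocation_decisions", [])
        if node["node_decision"] == "yes"
    ]
    return remaining_seconds, candidates


# Trimmed sample containing only the fields read above.
delayed = {
    "can_allocate": "allocation_delayed",
    "remaining_delay_in_millis": 59824,
    "node_allocation_decisions": [
        {"node_name": "node_t2", "node_decision": "yes"},
        {"node_name": "node_t0", "node_decision": "no"},
    ],
}

print(delayed_allocation_summary(delayed))
# prints (59.824, ['node_t2'])
```

The helper returns `None` for any response whose `can_allocate` value is not `allocation_delayed`, so callers can skip shards that are unassigned for other reasons.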

===== Examples of allocated shard explanations

The API response output for an assigned shard that is not allowed to remain on
its current node and is required to move:

[source,js]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : true,
  "current_state" : "started",
  "current_node" : {
    "id" : "8lWJeJ7tSoui0bxrwuNhTA",
    "name" : "node_t1",
    "transport_address" : "127.0.0.1:9401"
  },
  "can_remain_on_current_node" : "no",            <1>
  "can_remain_decisions" : [                      <2>
    {
      "decider" : "filter",
      "decision" : "NO",
      "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"
    }
  ],
  "can_move_to_other_node" : "no",                <3>
  "move_explanation" : "cannot move shard to another node, even though it is not allowed to remain on its current node",
  "node_allocation_decisions" : [
    {
      "node_id" : "_P8olZS8Twax9u6ioN-GGA",
      "node_name" : "node_t0",
      "transport_address" : "127.0.0.1:9400",
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"
        }
      ]
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE

<1> Whether the shard is allowed to remain on its current node.
<2> The deciders that factored into the decision not to let the shard remain on its current node.
<3> Whether the shard is allowed to be allocated to another node.

The API response output for an assigned shard that remains on its current node
because moving the shard to another node would not improve the cluster balance:

[source,js]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : true,
  "current_state" : "started",
  "current_node" : {
    "id" : "wLzJm4N4RymDkBYxwWoJsg",
    "name" : "node_t0",
    "transport_address" : "127.0.0.1:9400",
    "weight_ranking" : 1
  },
  "can_remain_on_current_node" : "yes",
  "can_rebalance_cluster" : "yes",                <1>
  "can_rebalance_to_other_node" : "no",           <2>
  "rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
  "node_allocation_decisions" : [
    {
      "node_id" : "oE3EGFc8QN-Tdi5FFEprIA",
      "node_name" : "node_t1",
      "transport_address" : "127.0.0.1:9401",
      "node_decision" : "worse_balance",          <3>
      "weight_ranking" : 1
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE

<1> Whether rebalancing is allowed on the cluster.
<2> Whether the shard can be rebalanced to another node.
<3> The reason the shard cannot be rebalanced to the node, in this case indicating that it offers no better balance than the current node.
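
The two assigned-shard responses above differ mainly in three fields, which a script can use to tell the cases apart. The following Python sketch is one way to do that. The function name and the returned labels (`must_move`, `must_move_but_blocked`, `can_rebalance`, `stays_put`) are invented for this example and are not values returned by the API.

```python
def assigned_shard_status(explanation):
    """Classify an assigned-shard explanation using hypothetical labels."""
    if explanation.get("can_remain_on_current_node") == "no":
        # The shard has to leave its node; check whether any node accepts it.
        if explanation.get("can_move_to_other_node") == "yes":
            return "must_move"
        return "must_move_but_blocked"
    if explanation.get("can_rebalance_to_other_node") == "yes":
        return "can_rebalance"
    return "stays_put"


# Trimmed samples containing only the fields read above.
forced_move = {
    "can_remain_on_current_node": "no",
    "can_move_to_other_node": "no",
}
balanced = {
    "can_remain_on_current_node": "yes",
    "can_rebalance_cluster": "yes",
    "can_rebalance_to_other_node": "no",
}

print(assigned_shard_status(forced_move))  # prints must_move_but_blocked
print(assigned_shard_status(balanced))     # prints stays_put
```

The sketch only distinguishes `yes` from everything else, so any other value the API might return for these fields falls into the more conservative label.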