[[cat-recovery]]
== Recovery

`recovery` is a view of shard replication. It shows information
whenever data from at least one shard is being copied to a different
node. Entries can also appear after a cluster restart. If a recovery
seems stuck, run it a few times to see whether there is any movement.
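
One way to check for movement is to save the output twice, a short
interval apart, and compare the `recovered` column. The following is a
minimal sketch, not part of Elasticsearch: `stalled_shards` is a
hypothetical helper, and it assumes the column order shown on this page
and output saved without the `?v` header.

[source,shell]
--------------------------------------------------
# Hypothetical helper, not part of Elasticsearch.
# Compares two saved snapshots of `_cat/recovery` output (no header line)
# and prints "index shard ip" for every copy whose recovered byte count
# did not change between them.
# Assumed columns: index shard target recovered % ip node
stalled_shards() {
    awk '
        # First file: remember recovered bytes per (index, shard, ip).
        NR == FNR { before[$1 " " $2 " " $6] = $4; next }
        # Second file: report copies whose byte count is unchanged.
        {
            key = $1 " " $2 " " $6
            if ((key in before) && before[key] == $4) print key
        }
    ' "$1" "$2"
}
--------------------------------------------------

For example, save the output of `curl -s 192.168.56.30:9200/_cat/recovery`
to `before.txt`, wait a little, save it again to `after.txt`, and run
`stalled_shards before.txt after.txt`. A shard listed in the output made
no progress in that interval; whether that means it is stuck depends on
how long you waited.
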

As an example, let's enable replicas on a cluster that has two
indices with three shards each. Afterward we'll have twelve shards in
total (six primaries and six replicas), but before the replica shards
reach `STARTED`, we'll take a snapshot of the recovery:

[source,shell]
--------------------------------------------------
% curl -XPUT 192.168.56.30:9200/_settings -d'{"number_of_replicas":1}'
{"acknowledged":true}
% curl '192.168.56.30:9200/_cat/recovery?v'
index shard   target recovered     % ip            node
wiki1 2     68083830   7865837 11.6% 192.168.56.20 Adam II
wiki2 1      2542400    444175 17.5% 192.168.56.20 Adam II
wiki2 2      3242108    329039 10.1% 192.168.56.10 Jarella
wiki2 0      2614132         0  0.0% 192.168.56.30 Solarr
wiki1 0     60992898   4719290  7.7% 192.168.56.30 Solarr
wiki1 1     47630362   6798313 14.3% 192.168.56.10 Jarella
--------------------------------------------------

Six shards are in recovery (one replica for each primary), at
varying points of progress.
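
The `%` column is the `recovered` byte count divided by `target`, so
the same figures also tell you how many bytes each shard still has to
copy. The sketch below recomputes the percentage and adds the remaining
bytes. `recovery_progress` is a hypothetical helper, not an
Elasticsearch command, and it assumes the column order shown above.

[source,shell]
--------------------------------------------------
# Hypothetical helper, not part of Elasticsearch.
# Reads `_cat/recovery` output on stdin and prints, per row:
#   index shard recomputed-percent bytes-remaining
# Assumed columns: index shard target recovered % ip node
recovery_progress() {
    awk '
        $1 == "index" { next }      # skip the header printed by ?v
        $3 > 0 {                    # avoid dividing by an empty target
            printf "%s %s %.1f%% %d\n", $1, $2, 100 * $4 / $3, $3 - $4
        }
    '
}
--------------------------------------------------

Piping the listing above through it, for instance with
`curl -s '192.168.56.30:9200/_cat/recovery?v' | recovery_progress`,
would turn the first row into `wiki1 2 11.6% 60217993`.
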

Let's restart the cluster and then lose a node. This output shows us
what was moving around shortly after the node left the cluster.

[source,shell]
--------------------------------------------------
% curl 192.168.56.30:9200/_cat/health; curl 192.168.56.30:9200/_cat/recovery
1384315040 19:57:20 foo yellow 2 2 8 6 0 4 0
wiki2 2  1621477        0  0.0% 192.168.56.30 Garrett, Jonathan "John"
wiki2 0  1307488        0  0.0% 192.168.56.20 Commander Kraken
wiki1 0 32696794 20984240 64.2% 192.168.56.20 Commander Kraken
wiki1 1 31123128 21951695 70.5% 192.168.56.30 Garrett, Jonathan "John"
--------------------------------------------------

[float]
[[big-percent]]
=== Why am I seeing recovery percentages greater than 100?

This can happen if a shard copy goes away and comes back while the
primary is indexing. The replica shard catches up with the primary by
receiving any new segments created during its outage. These new
segments can contain data the replica already holds, because they are
the result of merges on the primary: the same documents now live in
different, larger segments. After the new segments are copied over,
the replica deletes the segments it no longer needs, resulting in a
dataset that more closely matches the primary (or matches it exactly,
if indexing has stopped).
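
To see which shards are currently in this state, you can compare the
`recovered` column with `target` directly. This is a minimal sketch
under the same column-order assumption as the listings on this page;
`over_target` is a hypothetical helper name, not an Elasticsearch
command.

[source,shell]
--------------------------------------------------
# Hypothetical helper, not part of Elasticsearch.
# Reads `_cat/recovery` output on stdin and prints "index shard percent"
# for rows where more bytes were recovered than the target size.
# Assumed columns: index shard target recovered % ip node
over_target() {
    awk '
        $1 == "index" { next }                      # skip the ?v header
        ($4 + 0) > ($3 + 0) { print $1, $2, $5 }    # recovered exceeds target
    '
}
--------------------------------------------------

A row printed by this filter is not an error in itself. As described
above, it means the replica copied merged segments that overlap with
data it already had.
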