[[modules-discovery-quorums]]
=== Quorum-based decision making

Electing a master node and changing the cluster state are the two fundamental
tasks that master-eligible nodes must work together to perform. It is important
that these activities work robustly even if some nodes have failed.

Elasticsearch achieves this robustness by considering each action to have
succeeded on receipt of responses from a _quorum_, which is a subset of the
master-eligible nodes in the cluster. The advantage of requiring only a subset
of the nodes to respond is that some of the nodes can fail without preventing
the cluster from making progress. The quorums are carefully chosen so that the
cluster cannot end up in a "split brain" scenario, in which it is partitioned
into two pieces and each piece may make decisions that are inconsistent with
those of the other piece.

Elasticsearch allows you to add master-eligible nodes to a running cluster and
to remove them from it. In many cases you can do this simply by starting or
stopping the nodes as required. See <<modules-discovery-adding-removing-nodes>>.

As nodes are added or removed, Elasticsearch maintains an optimal level of fault
tolerance by updating the cluster's <<modules-discovery-voting,voting
configuration>>, which is the set of master-eligible nodes whose responses are
counted when making decisions such as electing a new master or committing a new
cluster state. A decision is made only after more than half of the nodes in the
voting configuration have responded. Usually the voting configuration is the
same as the set of all the master-eligible nodes that are currently in the
cluster. However, there are some situations in which they may be different.
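
The "more than half" rule can be illustrated with a short sketch. This is not
Elasticsearch code: the function names `has_quorum` and `majorities` and the
node names are hypothetical, chosen only for illustration. The sketch also
checks the property that rules out split brain, namely that any two majorities
of the same voting configuration share at least one node.

```python
from itertools import combinations


def has_quorum(voting_config, responders):
    """Return True if more than half of the voting configuration responded.

    Responses from nodes outside the voting configuration are not counted.
    """
    votes = len(set(voting_config) & set(responders))
    return 2 * votes > len(voting_config)


def majorities(voting_config):
    """List every subset of the voting configuration that forms a majority."""
    nodes = sorted(voting_config)
    smallest = len(nodes) // 2 + 1
    return [
        set(subset)
        for size in range(smallest, len(nodes) + 1)
        for subset in combinations(nodes, size)
    ]


if __name__ == "__main__":
    config = {"node-a", "node-b", "node-c"}  # hypothetical node names
    print(has_quorum(config, {"node-a", "node-b"}))  # two of three responded
    print(has_quorum(config, {"node-a"}))            # one of three responded
    # Any two majorities overlap, so two disjoint halves of a partitioned
    # cluster cannot both reach a quorum.
    print(all(q1 & q2 for q1 in majorities(config) for q2 in majorities(config)))
```

Because two disjoint groups of nodes cannot both contain more than half of the
voting configuration, at most one side of a network partition can make
decisions.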

To be sure that the cluster remains available you **must not stop half or more
of the nodes in the voting configuration at the same time**. As long as more
than half of the voting nodes are available the cluster can still work normally.
This means that if there are three or four master-eligible nodes, the cluster
can tolerate one of them being unavailable. If there are two or fewer
master-eligible nodes, they must all remain available.
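
The arithmetic behind these numbers can be worked through in a few lines. This
is a minimal sketch that assumes the voting configuration contains every
master-eligible node; the function names `majority_size` and
`tolerated_failures` are hypothetical and do not correspond to any
Elasticsearch API.

```python
def majority_size(voting_nodes):
    """Smallest number of nodes that is more than half of the total."""
    if voting_nodes < 1:
        raise ValueError("a voting configuration needs at least one node")
    return voting_nodes // 2 + 1


def tolerated_failures(voting_nodes):
    """Number of voting nodes that can be unavailable while a majority remains."""
    return voting_nodes - majority_size(voting_nodes)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, "voting nodes:", "majority", majority_size(n),
              "tolerates", tolerated_failures(n), "unavailable")
```

With these definitions, three and four voting nodes both tolerate one
unavailable node, one and two voting nodes tolerate none, and five voting nodes
tolerate two.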

After a node has joined or left the cluster the elected master must issue a
cluster-state update that adjusts the voting configuration to match, and this
can take a short time to complete. It is important to wait for this adjustment
to complete before removing more nodes from the cluster.

[discrete]
==== Master elections

Elasticsearch uses an election process to agree on an elected master node, both
at startup and if the existing elected master fails. Any master-eligible node
can start an election, and normally the first election that takes place will
succeed. Elections usually fail only when two nodes happen to start their
elections at about the same time, so elections are scheduled randomly on each
node to reduce the probability of this happening. Nodes will retry elections
until a master is elected, backing off on failure, so that eventually an
election will succeed (with arbitrarily high probability). The scheduling of
master elections is controlled by the <<master-election-settings,master
election settings>>.
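
The idea of randomized scheduling with back-off can be sketched as follows.
This is an illustration of the general technique, not the Elasticsearch
implementation: the function `next_election_delay`, its parameter names, the
default values, and the choice of a linearly growing upper bound are all
assumptions made for this example.

```python
import random


def next_election_delay(failed_attempts, initial_timeout=0.1,
                        back_off_time=0.1, max_timeout=10.0, rng=random):
    """Pick a random delay, in seconds, before the next election attempt.

    The upper bound on the delay grows with each failed attempt and is capped
    at max_timeout. All parameters are hypothetical values for illustration.
    """
    upper_bound = min(initial_timeout + failed_attempts * back_off_time,
                      max_timeout)
    # Drawing uniformly below the bound makes it unlikely that two nodes
    # start their elections at about the same time.
    return rng.uniform(0.0, upper_bound)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only so that repeated runs match
    for attempt in range(5):
        delay = next_election_delay(attempt, rng=rng)
        print(f"after {attempt} failed attempts: wait {delay:.3f}s")
```

Widening the range after each failure spreads the nodes' attempts further
apart, which is why repeated collisions become increasingly unlikely.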

[discrete]
==== Cluster maintenance, rolling restarts and migrations

Many cluster maintenance tasks involve temporarily shutting down one or more
nodes and then starting them back up again. By default {es} can remain
available if one of its master-eligible nodes is taken offline, such as during a
rolling upgrade. Furthermore, if multiple nodes are stopped and then started
again, the cluster will recover automatically, such as during a full cluster
restart. There is no need to take any further action with the APIs described
here in these cases, because the set of master nodes is not changing
permanently.