|
@@ -1,5 +1,5 @@
|
|
|
[[add-elasticsearch-nodes]]
|
|
|
-== Adding nodes to your cluster
|
|
|
+== Add and remove nodes in your cluster
|
|
|
|
|
|
When you start an instance of {es}, you are starting a _node_. An {es} _cluster_
|
|
|
is a group of nodes that have the same `cluster.name` attribute. As nodes join
|
|
@@ -41,3 +41,141 @@ the rest of its cluster.
|
|
|
|
|
|
For more information about discovery and shard allocation, see
|
|
|
<<modules-discovery>> and <<modules-cluster>>.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[add-elasticsearch-nodes-master-eligible]]
|
|
|
+=== Master-eligible nodes
|
|
|
+
|
|
|
+As nodes are added or removed Elasticsearch maintains an optimal level of fault
|
|
|
+tolerance by automatically updating the cluster's _voting configuration_, which
|
|
|
+is the set of <<master-node,master-eligible nodes>> whose responses are counted
|
|
|
+when making decisions such as electing a new master or committing a new cluster
|
|
|
+state.
|
|
|
+
|
|
|
+It is recommended to have a small and fixed number of master-eligible nodes in a
|
|
|
+cluster, and to scale the cluster up and down by adding and removing
|
|
|
+master-ineligible nodes only. However there are situations in which it may be
|
|
|
+desirable to add or remove some master-eligible nodes to or from a cluster.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[modules-discovery-adding-nodes]]
|
|
|
+==== Adding master-eligible nodes
|
|
|
+
|
|
|
+If you wish to add some nodes to your cluster, simply configure the new nodes
|
|
|
+to find the existing cluster and start them up. Elasticsearch adds the new nodes
|
|
|
+to the voting configuration if it is appropriate to do so.
|
|
|
+
|
|
|
+During master election or when joining an existing formed cluster, a node
|
|
|
+sends a join request to the master in order to be officially added to the
|
|
|
+cluster. You can use the `cluster.join.timeout` setting to configure how long a
|
|
|
+node waits after sending a request to join a cluster. Its default value is `30s`.
|
|
|
+See <<modules-discovery-settings>>.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[modules-discovery-removing-nodes]]
|
|
|
+==== Removing master-eligible nodes
|
|
|
+
|
|
|
+When removing master-eligible nodes, it is important not to remove too many all
|
|
|
+at the same time. For instance, if there are currently seven master-eligible
|
|
|
+nodes and you wish to reduce this to three, it is not possible simply to stop
|
|
|
+four of the nodes at once: to do so would leave only three nodes remaining,
|
|
|
+which is less than half of the voting configuration, which means the cluster
|
|
|
+cannot take any further actions.
|
|
|
+
|
|
|
+More precisely, if you shut down half or more of the master-eligible nodes all
|
|
|
+at the same time then the cluster will normally become unavailable. If this
|
|
|
+happens then you can bring the cluster back online by starting the removed
|
|
|
+nodes again.
|
|
|
+
|
|
|
+As long as there are at least three master-eligible nodes in the cluster, as a
|
|
|
+general rule it is best to remove nodes one-at-a-time, allowing enough time for
|
|
|
+the cluster to <<modules-discovery-quorums,automatically adjust>> the voting
|
|
|
+configuration and adapt the fault tolerance level to the new set of nodes.
|
|
|
+
|
|
|
+If there are only two master-eligible nodes remaining then neither node can be
|
|
|
+safely removed since both are required to reliably make progress. To remove one
|
|
|
+of these nodes you must first inform {es} that it should not be part of the
|
|
|
+voting configuration, and that the voting power should instead be given to the
|
|
|
+other node. You can then take the excluded node offline without preventing the
|
|
|
+other node from making progress. A node which is added to a voting
|
|
|
+configuration exclusion list still works normally, but {es} tries to remove it
|
|
|
+from the voting configuration so its vote is no longer required. Importantly,
|
|
|
+{es} will never automatically move a node on the voting exclusions list back
|
|
|
+into the voting configuration. Once an excluded node has been successfully
|
|
|
+auto-reconfigured out of the voting configuration, it is safe to shut it down
|
|
|
+without affecting the cluster's master-level availability. A node can be added
|
|
|
+to the voting configuration exclusion list using the
|
|
|
+<<voting-config-exclusions>> API. For example:
|
|
|
+
|
|
|
+[source,console]
|
|
|
+--------------------------------------------------
|
|
|
+# Add node to voting configuration exclusions list and wait for the system
|
|
|
+# to auto-reconfigure the node out of the voting configuration up to the
|
|
|
+# default timeout of 30 seconds
|
|
|
+POST /_cluster/voting_config_exclusions?node_names=node_name
|
|
|
+
|
|
|
+# Add node to voting configuration exclusions list and wait for
|
|
|
+# auto-reconfiguration up to one minute
|
|
|
+POST /_cluster/voting_config_exclusions?node_names=node_name&timeout=1m
|
|
|
+--------------------------------------------------
|
|
|
+// TEST[skip:this would break the test cluster if executed]
|
|
|
+
|
|
|
+The nodes that should be added to the exclusions list are specified by name
|
|
|
+using the `?node_names` query parameter, or by their persistent node IDs using
|
|
|
+the `?node_ids` query parameter. If a call to the voting configuration
|
|
|
+exclusions API fails, you can safely retry it. Only a successful response
|
|
|
+guarantees that the node has actually been removed from the voting configuration
|
|
|
+and will not be reinstated. If the elected master node is excluded from the
|
|
|
+voting configuration then it will abdicate to another master-eligible node that
|
|
|
+is still in the voting configuration if such a node is available.
|
|
|
+
|
|
|
+Although the voting configuration exclusions API is most useful for down-scaling
|
|
|
+a two-node to a one-node cluster, it is also possible to use it to remove
|
|
|
+multiple master-eligible nodes all at the same time. Adding multiple nodes to
|
|
|
+the exclusions list has the system try to auto-reconfigure all of these nodes
|
|
|
+out of the voting configuration, allowing them to be safely shut down while
|
|
|
+keeping the cluster available. In the example described above, shrinking a
|
|
|
+seven-master-node cluster down to only have three master nodes, you could add
|
|
|
+four nodes to the exclusions list, wait for confirmation, and then shut them
|
|
|
+down simultaneously.
|
|
|
+
|
|
|
+NOTE: Voting exclusions are only required when removing at least half of the
|
|
|
+master-eligible nodes from a cluster in a short time period. They are not
|
|
|
+required when removing master-ineligible nodes, nor are they required when
|
|
|
+removing fewer than half of the master-eligible nodes.
|
|
|
+
|
|
|
+Adding an exclusion for a node creates an entry for that node in the voting
|
|
|
+configuration exclusions list, which has the system automatically try to
|
|
|
+reconfigure the voting configuration to remove that node and prevents it from
|
|
|
+returning to the voting configuration once it has removed. The current list of
|
|
|
+exclusions is stored in the cluster state and can be inspected as follows:
|
|
|
+
|
|
|
+[source,console]
|
|
|
+--------------------------------------------------
|
|
|
+GET /_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions
|
|
|
+--------------------------------------------------
|
|
|
+
|
|
|
+This list is limited in size by the `cluster.max_voting_config_exclusions`
|
|
|
+setting, which defaults to `10`. See <<modules-discovery-settings>>. Since
|
|
|
+voting configuration exclusions are persistent and limited in number, they must
|
|
|
+be cleaned up. Normally an exclusion is added when performing some maintenance
|
|
|
+on the cluster, and the exclusions should be cleaned up when the maintenance is
|
|
|
+complete. Clusters should have no voting configuration exclusions in normal
|
|
|
+operation.
|
|
|
+
|
|
|
+If a node is excluded from the voting configuration because it is to be shut
|
|
|
+down permanently, its exclusion can be removed after it is shut down and removed
|
|
|
+from the cluster. Exclusions can also be cleared if they were created in error
|
|
|
+or were only required temporarily by specifying `?wait_for_removal=false`.
|
|
|
+
|
|
|
+[source,console]
|
|
|
+--------------------------------------------------
|
|
|
+# Wait for all the nodes with voting configuration exclusions to be removed from
|
|
|
+# the cluster and then remove all the exclusions, allowing any node to return to
|
|
|
+# the voting configuration in the future.
|
|
|
+DELETE /_cluster/voting_config_exclusions
|
|
|
+
|
|
|
+# Immediately remove all the voting configuration exclusions, allowing any node
|
|
|
+# to return to the voting configuration in the future.
|
|
|
+DELETE /_cluster/voting_config_exclusions?wait_for_removal=false
|
|
|
+--------------------------------------------------
|