
[DOCS] Add docs for designing resilient clusters (#47233)

Adds some guidance for designing clusters to be resilient to
failures, including example architectures.

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
David Turner 5 years ago
commit 1f28bd07a3

+ 22 - 23
docs/reference/high-availability.asciidoc

@@ -3,31 +3,30 @@
 
 [partintro]
 --
-As with any software that stores data,
-it is important to routinely back up your data.
-{es}'s <<glossary-replica-shard,replica shards>> provide high availability
-during runtime;
-they enable you to tolerate sporadic node loss
-without an interruption of service.
-
-However, replica shards do not protect an {es} cluster
-from catastrophic failure.
-You need a backup of your cluster—
-a copy in case something goes wrong.
-
-
-{es} offers two features to support high availability for a cluster:
-
-* <<backup-cluster,Snapshot and restore>>,
-which you can use to back up individual indices or entire clusters.
-You can automatically store these backups in a repository on a shared filesystem.
-
-* <<xpack-ccr,Cross-cluster replication (CCR)>>,
-which you can use to copy indices in remote clusters to a local cluster.
-You can use {ccr} to recover from the failure of a primary cluster
-or serve data locally based on geo-proximity.
+Your data is important to you. Keeping it safe and available is important
+to {es}. Sometimes your cluster may experience hardware failure or a power
+loss. To help you plan for this, {es} offers a number of features
+to achieve high availability despite failures.
+
+* With proper planning, a cluster can be
+  <<high-availability-cluster-design,designed for resilience>> to many of the
+  things that commonly go wrong, from the loss of a single node or network
+  connection right up to a zone-wide outage such as power loss.
+
+* You can use <<xpack-ccr,{ccr}>> to replicate data to a remote _follower_
+  cluster which may be in a different data centre or even on a different
+  continent from the leader cluster. The follower cluster acts as a hot
+  standby, ready for you to fail over in the event of a disaster so severe that
+  the leader cluster fails. The follower cluster can also act as a geo-replica
+  to serve searches from nearby clients.
+
+* The last line of defence against data loss is to take
+  <<backup-cluster,regular snapshots>> of your cluster so that you can restore
+  a completely fresh copy of it elsewhere if needed.
 --
 
+include::high-availability/cluster-design.asciidoc[]
+
 include::high-availability/backup-cluster.asciidoc[]
 
 include::ccr/index.asciidoc[]

+ 351 - 0
docs/reference/high-availability/cluster-design.asciidoc

@@ -0,0 +1,351 @@
+[[high-availability-cluster-design]]
+== Designing for resilience
+
+Distributed systems like {es} are designed to keep working even if some of
+their components have failed. As long as there are enough well-connected
+nodes to take over their responsibilities, an {es} cluster can continue
+operating normally if some of its nodes are unavailable or disconnected.
+
+There is a limit to how small a resilient cluster can be. All {es} clusters
+require:
+
+* One <<modules-discovery-quorums,elected master node>>
+* At least one node for each <<modules-node,role>>.
+* At least one copy of every <<scalability,shard>>.
+
+A resilient cluster requires redundancy for every required cluster component,
+except the elected master node. For resilient clusters, we recommend:
+
+* One elected master node
+* At least three master-eligible nodes
+* At least two nodes of each role
+* At least two copies of each shard (one primary and one or more replicas)
+
+A resilient cluster needs three master-eligible nodes so that if one of
+them fails then the remaining two still form a majority and can hold a
+successful election.
+
+Similarly, node redundancy makes it likely that if a node for a particular role
+fails, another node can take on its responsibilities.
+
+Finally, a resilient cluster should have at least two copies of each shard. If
+one copy fails then there is another good copy to take over. {es} automatically
+rebuilds any failed shard copies on the remaining nodes in order to restore the
+cluster to full health after a failure.
+
+Depending on your needs and budget, an {es} cluster can consist of a single
+node, hundreds of nodes, or any number in between. When designing a smaller
+cluster, you should typically focus on making it resilient to single-node
+failures. Designers of larger clusters must also consider cases where multiple
+nodes fail at the same time. The following pages give some recommendations for
+building resilient clusters of various sizes:
+
+* <<high-availability-cluster-small-clusters>>
+* <<high-availability-cluster-design-large-clusters>>
+
+[[high-availability-cluster-small-clusters]]
+=== Resilience in small clusters
+
+In smaller clusters, it is most important to be resilient to single-node
+failures. This section gives some guidance on making your cluster as resilient
+as possible to the failure of an individual node.
+
+[[high-availability-cluster-design-one-node]]
+==== One-node clusters
+
+If your cluster consists of one node, that single node must do everything.
+To accommodate this, {es} assigns nodes every role by default.
+
+A single-node cluster is not resilient. If the node fails, the cluster will
+stop working. Because there are no replicas in a one-node cluster, you cannot
+store your data redundantly. However, at least one replica is required for a
+<<cluster-health,`green` cluster health status>>. To ensure your cluster can
+report a `green` status, set
+<<dynamic-index-settings,`index.number_of_replicas`>> to `0` on every index. If
+the node fails, you may need to restore an older copy of any lost indices from a
+<<modules-snapshots,snapshot>>. Because they are not resilient to any failures,
+we do not recommend using one-node clusters in production.
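+
+For example, the following is a minimal sketch of how you might apply this
+setting to every existing index; the `_all` target is only an illustration and
+you may prefer to list specific indices instead.
+
+[source,console]
+----
+PUT /_all/_settings
+{
+  "index.number_of_replicas": 0
+}
+----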
+
+[[high-availability-cluster-design-two-nodes]]
+==== Two-node clusters
+
+If you have two nodes, we recommend they both be data nodes. You should also
+ensure every shard is stored redundantly on both nodes by setting
+<<dynamic-index-settings,`index.number_of_replicas`>> to `1` on every index.
+This is the default number of replicas but may be overridden by an
+<<indices-templates,index template>>. <<dynamic-index-settings,Auto-expand
+replicas>> can also achieve the same thing, but it's not necessary to use this
+feature in such a small cluster.
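+
+If you want to be sure that no template has left an existing index without a
+replica, a minimal sketch is to apply the setting explicitly; the `_all`
+target is only an illustration.
+
+[source,console]
+----
+PUT /_all/_settings
+{
+  "index.number_of_replicas": 1
+}
+----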
+
+We recommend you set `node.master: false` on one of your two nodes so that it is
+not <<master-node,master-eligible>>. This means you can be certain which of your
+nodes is the elected master of the cluster. The cluster can tolerate the loss of
+the other master-ineligible node. If you don't set `node.master: false` on one
+node, both nodes are master-eligible. This means both nodes are required for a
+master election. Since the election will fail if either node is unavailable,
+your cluster cannot reliably tolerate the loss of either node.
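+
+For instance, a minimal sketch of the relevant `elasticsearch.yml` entry on
+whichever node you choose to make master-ineligible:
+
+[source,yaml]
+----
+# elasticsearch.yml on the master-ineligible node
+node.master: false
+----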
+
+By default, each node is assigned every role. We recommend you assign both nodes
+all other roles except master eligibility. If one node fails, the other node can
+handle its tasks.
+
+You should avoid sending client requests to just one of your nodes. If you do
+and this node fails, such requests will not receive responses even if the
+remaining node is a healthy cluster on its own. Ideally, you should balance your
+client requests across both nodes. A good way to do this is to specify the
+addresses of both nodes when configuring the client to connect to your cluster.
+Alternatively, you can use a resilient load balancer to balance client requests
+across the nodes in your cluster.
+
+Because it's not resilient to failures, we do not recommend deploying a two-node
+cluster in production.
+
+[[high-availability-cluster-design-two-nodes-plus]]
+==== Two-node clusters with a tiebreaker
+
+Because master elections are majority-based, the two-node cluster described
+above is tolerant to the loss of one of its nodes but not the
+other one. You cannot configure a two-node cluster so that it can tolerate
+the loss of _either_ node because this is theoretically impossible. You might
+expect that if either node fails then {es} can elect the remaining node as the
+master, but it is impossible to tell the difference between the failure of a
+remote node and a mere loss of connectivity between the nodes. If both nodes
+were capable of running independent elections, a loss of connectivity would
+lead to a https://en.wikipedia.org/wiki/Split-brain_(computing)[split-brain
+problem] and therefore data loss. {es} avoids this and
+protects your data by electing neither node as master until that node can be
+sure that it has the latest cluster state and that there is no other master in
+the cluster. This could result in the cluster having no master until
+connectivity is restored.
+
+You can solve this problem by adding a third node and making all three nodes
+master-eligible. A <<modules-discovery-quorums,master election>> requires only
+two of the three master-eligible nodes. This means the cluster can tolerate the
+loss of any single node. This third node acts as a tiebreaker in cases where the
+two original nodes are disconnected from each other. You can reduce the resource
+requirements of this extra node by making it a <<voting-only-node,dedicated
+voting-only master-eligible node>>, also known as a dedicated tiebreaker.
+Because it has no other roles, a dedicated tiebreaker does not need to be as
+powerful as the other two nodes. It will not perform any searches nor coordinate
+any client requests and cannot be elected as the master of the cluster.
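+
+As a sketch, the dedicated tiebreaker's `elasticsearch.yml` might contain
+entries like the following; exactly which roles you disable depends on the
+features you use.
+
+[source,yaml]
+----
+# elasticsearch.yml on the dedicated tiebreaker node
+node.master: true
+node.voting_only: true
+node.data: false
+node.ingest: false
+node.ml: false
+----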
+
+The two original nodes should not be voting-only master-eligible nodes since a
+resilient cluster requires at least three master-eligible nodes, at least two
+of which are not voting-only master-eligible nodes. If two of your three nodes
+are voting-only master-eligible nodes then the elected master must be the third
+node. This node then becomes a single point of failure.
+
+We recommend assigning both non-tiebreaker nodes all other roles. This creates
+redundancy by ensuring any task in the cluster can be handled by either node.
+
+You should not send any client requests to the dedicated tiebreaker node.
+You should also avoid sending client requests to just one of the other two
+nodes. If you do, and this node fails, then any requests will not
+receive responses, even if the remaining nodes form a healthy cluster. Ideally,
+you should balance your client requests across both of the non-tiebreaker
+nodes. You can do this by specifying the addresses of both nodes
+when configuring your client to connect to your cluster. Alternatively, you can
+use a resilient load balancer to balance client requests across the appropriate
+nodes in your cluster. The {ess-trial}[Elastic Cloud] service
+provides such a load balancer.
+
+A two-node cluster with an additional tiebreaker node is the smallest possible
+cluster that is suitable for production deployments.
+
+[[high-availability-cluster-design-three-nodes]]
+==== Three-node clusters
+
+If you have three nodes, we recommend they all be <<data-node,data
+nodes>> and every index should have at least one replica. Nodes are data nodes
+by default. You may prefer for some indices to have two replicas so that each
+node has a copy of each shard in those indices. You should also configure each
+node to be <<master-node,master-eligible>> so that any two of them can hold a
+master election without needing to communicate with the third node. Nodes are
+master-eligible by default. This cluster will be resilient to the loss of any
+single node.
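+
+For example, assuming a hypothetical index named `my-index`, you could give it
+two replicas so that each of the three nodes holds a copy of every shard:
+
+[source,console]
+----
+PUT /my-index/_settings
+{
+  "index.number_of_replicas": 2
+}
+----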
+
+You should avoid sending client requests to just one of your nodes. If you do,
+and this node fails, then any requests will not receive responses even if the
+remaining two nodes form a healthy cluster. Ideally, you should balance your
+client requests across all three nodes. You can do this by specifying the
+addresses of multiple nodes when configuring your client to connect to your
+cluster. Alternatively you can use a resilient load balancer to balance client
+requests across your cluster. The {ess-trial}[Elastic Cloud]
+service provides such a load balancer.
+
+[[high-availability-cluster-design-three-plus-nodes]]
+==== Clusters with more than three nodes
+
+Once your cluster grows to more than three nodes, you can start to specialise
+these nodes according to their responsibilities, allowing you to scale their
+resources independently as needed. You can have as many <<data-node,data
+nodes>>, <<ingest,ingest nodes>>, <<ml-node,{ml} nodes>>, etc. as needed to
+support your workload. As your cluster grows larger, we recommend using
+dedicated nodes for each role. This lets you independently scale resources
+for each task.
+
+However, it is good practice to limit the number of master-eligible nodes in
+the cluster to three. Master nodes do not scale like other node types since
+the cluster always elects just one of them as the master of the cluster. If
+there are too many master-eligible nodes then master elections may take a
+longer time to complete. In larger clusters, we recommend you
+configure some of your nodes as dedicated master-eligible nodes and avoid
+sending any client requests to these dedicated nodes. Your cluster may become
+unstable if the master-eligible nodes are overwhelmed with unnecessary extra
+work that could be handled by one of the other nodes.
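+
+As a sketch, a dedicated master-eligible node's `elasticsearch.yml` might look
+like this; which roles you disable depends on the features you use.
+
+[source,yaml]
+----
+# elasticsearch.yml on a dedicated master-eligible node
+node.master: true
+node.data: false
+node.ingest: false
+node.ml: false
+----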
+
+You may configure one of your master-eligible nodes to be a
+<<voting-only-node,voting-only node>> so that it can never be elected as the
+master node. For instance, you may have two dedicated master nodes and a third
+node that is both a data node and a voting-only master-eligible node. This
+third voting-only node will act as a tiebreaker in master elections but will
+never become the master itself.
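+
+A sketch of the configuration for such a combined data and voting-only
+master-eligible node:
+
+[source,yaml]
+----
+# elasticsearch.yml on the data node that also acts as the tiebreaker
+node.master: true
+node.voting_only: true
+node.data: true
+----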
+
+[[high-availability-cluster-design-small-cluster-summary]]
+==== Summary
+
+The cluster will be resilient to the loss of any node as long as:
+
+- The <<cluster-health,cluster health status>> is `green` (see the example
+  request after this list).
+- There are at least two data nodes. 
+- Every index has at least one replica of each shard, in addition to the 
+  primary.
+- The cluster has at least three master-eligible nodes. At least two of these
+  nodes are not voting-only master-eligible nodes.
+- Clients are configured to send their requests to more than one node or are
+  configured to use a load balancer that balances the requests across an
+  appropriate set of nodes. The {ess-trial}[Elastic Cloud] service provides such
+  a load balancer.
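+
+As a quick check of the first condition, you can ask the cluster for its
+health:
+
+[source,console]
+----
+GET /_cluster/health
+----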
+
+[[high-availability-cluster-design-large-clusters]]
+=== Resilience in larger clusters
+
+It is not unusual for nodes to share some common infrastructure, such as a power
+supply or network router. If so, you should plan for the failure of this
+infrastructure and ensure that such a failure would not affect too many of your
+nodes. It is common practice to group all the nodes sharing some infrastructure
+into _zones_ and to plan for the failure of any whole zone at once.
+
+Your cluster’s zones should all be contained within a single data centre. {es}
+expects its node-to-node connections to be reliable and have low latency and
+high bandwidth. Connections between data centres typically do not meet these
+expectations. Although {es} will behave correctly on an unreliable or slow
+network, it will not necessarily behave optimally. It may take a considerable
+length of time for a cluster to fully recover from a network partition since it
+must resynchronize any missing data and rebalance the cluster once the
+partition heals. If you want your data to be available in multiple data centres,
+deploy a separate cluster in each data centre and use
+<<modules-cross-cluster-search,{ccs}>> or <<xpack-ccr,{ccr}>> to link the
+clusters together. These features are designed to perform well even if the
+cluster-to-cluster connections are less reliable or slower than the network
+within each cluster.
+
+After losing a whole zone's worth of nodes, a properly-designed cluster may be
+functional but running with significantly reduced capacity. You may need
+to provision extra nodes to restore acceptable performance in your
+cluster when handling such a failure.
+
+For resilience against whole-zone failures, it is important that there is a copy
+of each shard in more than one zone, which can be achieved by placing data
+nodes in multiple zones and configuring <<allocation-awareness,shard allocation
+awareness>>. You should also ensure that client requests are sent to nodes in
+more than one zone.
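+
+A minimal sketch of the settings involved, assuming you use a custom node
+attribute named `zone` (the attribute name and values are up to you):
+
+[source,yaml]
+----
+# elasticsearch.yml on each data node, using that node's own zone
+node.attr.zone: zone-a
+
+# cluster-wide setting; it can be set here or dynamically via the cluster
+# settings API
+cluster.routing.allocation.awareness.attributes: zone
+----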
+
+You should consider all node roles and ensure that each role is split
+redundantly across two or more zones. For instance, if you are using
+<<ingest,ingest pipelines>> or {stack-ov}/xpack-ml.html[{ml}],
+you should have ingest or {ml} nodes in two or more zones. However,
+the placement of master-eligible nodes requires a little more care because a
+resilient cluster needs at least two of the three master-eligible nodes in
+order to function. The following sections explore the options for placing
+master-eligible nodes across multiple zones.
+
+[[high-availability-cluster-design-two-zones]]
+==== Two-zone clusters
+
+If you have two zones, you should have a different number of
+master-eligible nodes in each zone so that the zone with more nodes will
+contain a majority of them and will be able to survive the loss of the other
+zone. For instance, if you have three master-eligible nodes then you may put
+all of them in one zone or you may put two in one zone and the third in the
+other zone. You should not place an equal number of master-eligible nodes in
+each zone. If you place the same number of master-eligible nodes in each zone,
+neither zone has a majority of its own. Therefore, the cluster may not survive
+the loss of either zone.
+
+[[high-availability-cluster-design-two-zones-plus]]
+==== Two-zone clusters with a tiebreaker
+
+The two-zone deployment described above is tolerant to the loss of one of its
+zones but not to the loss of the other one because master elections are
+majority-based. You cannot configure a two-zone cluster so that it can tolerate
+the loss of _either_ zone because this is theoretically impossible. You might
+expect that if either zone fails then {es} can elect a node from the remaining
+zone as the master but it is impossible to tell the difference between the
+failure of a remote zone and a mere loss of connectivity between the zones. If
+both zones were capable of running independent elections then a loss of
+connectivity would lead to a
+https://en.wikipedia.org/wiki/Split-brain_(computing)[split-brain problem] and
+therefore data loss. {es} avoids this and protects your data by not electing
+a node from either zone as master until that node can be sure that it has the
+latest cluster state and that there is no other master in the cluster. This may
+mean there is no master at all until connectivity is restored.
+
+You can solve this by placing one master-eligible node in each of your two
+zones and adding a single extra master-eligible node in an independent third
+zone. The extra master-eligible node acts as a tiebreaker in cases
+where the two original zones are disconnected from each other. The extra
+tiebreaker node should be a <<voting-only-node,dedicated voting-only
+master-eligible node>>, also known as a dedicated tiebreaker. A dedicated
+tiebreaker need not be as powerful as the other two nodes since it has no other
+roles and will not perform any searches nor coordinate any client requests nor
+be elected as the master of the cluster.
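+
+As a sketch, assuming the hypothetical `zone` node attribute used in the
+earlier allocation awareness example and a third zone called `zone-c`, the
+tiebreaker's `elasticsearch.yml` might contain:
+
+[source,yaml]
+----
+# elasticsearch.yml on the dedicated tiebreaker node in the third zone
+node.attr.zone: zone-c
+node.master: true
+node.voting_only: true
+node.data: false
+node.ingest: false
+node.ml: false
+----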
+
+You should use <<allocation-awareness,shard allocation awareness>> to ensure
+that there is a copy of each shard in each zone. This means either zone remains
+fully available if the other zone fails.
+
+All master-eligible nodes, including voting-only nodes, are on the critical path
+for publishing cluster state updates. Because of this, these nodes require
+reasonably fast persistent storage and a reliable, low-latency network
+connection to the rest of the cluster. If you add a tiebreaker node in a third
+independent zone then you must make sure it has adequate resources and good
+connectivity to the rest of the cluster.
+
+[[high-availability-cluster-design-three-zones]]
+==== Clusters with three or more zones
+
+If you have three zones then you should have one master-eligible node in each
+zone. If you have more than three zones then you should choose three of the
+zones and put a master-eligible node in each of these three zones. This will
+mean that the cluster can still elect a master even if one of the zones fails.
+
+As always, your indices should have at least one replica in case a node fails.
+You should also use <<allocation-awareness,shard allocation awareness>> to
+limit the number of copies of each shard in each zone. For instance, if you have
+an index with one or two replicas configured then allocation awareness will
+ensure that the replicas of the shard are in a different zone from the primary.
+This means that a copy of every shard will still be available if one zone
+fails, so the availability of your data is not affected by such a failure.
+
+[[high-availability-cluster-design-large-cluster-summary]]
+==== Summary
+
+The cluster will be resilient to the loss of any zone as long as:
+
+- The <<cluster-health,cluster health status>> is `green`.
+- There are at least two zones containing data nodes.
+- Every index has at least one replica of each shard, in addition to the 
+  primary.
+- Shard allocation awareness is configured to avoid concentrating all copies of
+  a shard within a single zone.
+- The cluster has at least three master-eligible nodes. At least two of these
+  nodes are not voting-only master-eligible nodes, and they are spread evenly
+  across at least three zones.
+- Clients are configured to send their requests to nodes in more than one zone
+  or are configured to use a load balancer that balances the requests across an
+  appropriate set of nodes. The {ess-trial}[Elastic Cloud] service provides such
+  a load balancer.