|
|
@@ -1,13 +1,12 @@
|
|
|
[[modules-discovery-zen]]
|
|
|
=== Zen Discovery
|
|
|
|
|
|
-The zen discovery is the built in discovery module for Elasticsearch and
|
|
|
-the default. It provides unicast discovery, but can be extended to
|
|
|
-support cloud environments and other forms of discovery.
|
|
|
+Zen discovery is the built-in, default, discovery module for Elasticsearch. It
|
|
|
+provides unicast and file-based discovery, and can be extended to support cloud
|
|
|
+environments and other forms of discovery via plugins.
|
|
|
|
|
|
-The zen discovery is integrated with other modules, for example, all
|
|
|
-communication between nodes is done using the
|
|
|
-<<modules-transport,transport>> module.
|
|
|
+Zen discovery is integrated with other modules, for example, all communication
|
|
|
+between nodes is done using the <<modules-transport,transport>> module.
|
|
|
|
|
|
It is separated into several sub modules, which are explained below:
|
|
|
|
|
|
@@ -15,86 +14,159 @@ It is separated into several sub modules, which are explained below:
|
|
|
[[ping]]
|
|
|
==== Ping
|
|
|
|
|
|
-This is the process where a node uses the discovery mechanisms to find
|
|
|
-other nodes.
|
|
|
+This is the process where a node uses the discovery mechanisms to find other
|
|
|
+nodes.
|
|
|
+
|
|
|
+[float]
|
|
|
+[[discovery-seed-nodes]]
|
|
|
+==== Seed nodes
|
|
|
+
|
|
|
+Zen discovery uses a list of _seed_ nodes in order to start off the discovery
|
|
|
+process. At startup, or when electing a new master, Elasticsearch tries to
|
|
|
+connect to each seed node in its list, and holds a gossip-like conversation with
|
|
|
+them to find other nodes and to build a complete picture of the cluster. By
|
|
|
+default there are two methods for configuring the list of seed nodes: _unicast_
|
|
|
+and _file-based_. It is recommended that the list of seed nodes comprises the
|
|
|
+list of master-eligible nodes in the cluster.
|
|
|
|
|
|
[float]
|
|
|
[[unicast]]
|
|
|
===== Unicast
|
|
|
|
|
|
-Unicast discovery requires a list of hosts to use that will act as gossip
|
|
|
-routers. These hosts can be specified as hostnames or IP addresses; hosts
|
|
|
-specified as hostnames are resolved to IP addresses during each round of
|
|
|
-pinging. Note that if you are in an environment where DNS resolutions vary with
|
|
|
-time, you might need to adjust your <<networkaddress-cache-ttl,JVM security
|
|
|
-settings>>.
|
|
|
+Unicast discovery configures a static list of hosts for use as seed nodes.
|
|
|
+These hosts can be specified as hostnames or IP addresses; hosts specified as
|
|
|
+hostnames are resolved to IP addresses during each round of pinging. Note that
|
|
|
+if you are in an environment where DNS resolutions vary with time, you might
|
|
|
+need to adjust your <<networkaddress-cache-ttl,JVM security settings>>.
|
|
|
|
|
|
-It is recommended that the unicast hosts list be maintained as the list of
|
|
|
-master-eligible nodes in the cluster.
|
|
|
+The list of hosts is set using the `discovery.zen.ping.unicast.hosts` static
|
|
|
+setting. This is either an array of hosts or a comma-delimited string. Each
|
|
|
+value should be in the form of `host:port` or `host` (where `port` defaults to
|
|
|
+the setting `transport.profiles.default.port` falling back to
|
|
|
+`transport.tcp.port` if not set). Note that IPv6 hosts must be bracketed. The
|
|
|
+default for this setting is `127.0.0.1, [::1]`
|
|
|
|
|
|
-Unicast discovery provides the following settings with the `discovery.zen.ping.unicast` prefix:
|
|
|
+Additionally, the `discovery.zen.ping.unicast.resolve_timeout` configures the
|
|
|
+amount of time to wait for DNS lookups on each round of pinging. This is
|
|
|
+specified as a <<time-units, time unit>> and defaults to 5s.
|
|
|
|
|
|
-[cols="<,<",options="header",]
|
|
|
-|=======================================================================
|
|
|
-|Setting |Description
|
|
|
-|`hosts` |Either an array setting or a comma delimited setting. Each
|
|
|
- value should be in the form of `host:port` or `host` (where `port` defaults to the setting `transport.profiles.default.port`
|
|
|
- falling back to `transport.tcp.port` if not set). Note that IPv6 hosts must be bracketed. Defaults to `127.0.0.1, [::1]`
|
|
|
-|`hosts.resolve_timeout` |The amount of time to wait for DNS lookups on each round of pinging. Specified as
|
|
|
-<<time-units, time units>>. Defaults to 5s.
|
|
|
-|=======================================================================
|
|
|
+Unicast discovery uses the <<modules-transport,transport>> module to perform the
|
|
|
+discovery.
|
|
|
|
|
|
-The unicast discovery uses the <<modules-transport,transport>> module to perform the discovery.
|
|
|
+[float]
|
|
|
+[[file-based-hosts-provider]]
|
|
|
+===== File-based
|
|
|
+
|
|
|
+In addition to hosts provided by the static `discovery.zen.ping.unicast.hosts`
|
|
|
+setting, it is possible to provide a list of hosts via an external file.
|
|
|
+Elasticsearch reloads this file when it changes, so that the list of seed nodes
|
|
|
+can change dynamically without needing to restart each node. For example, this
|
|
|
+gives a convenient mechanism for an Elasticsearch instance that is run in a
|
|
|
+Docker container to be dynamically supplied with a list of IP addresses to
|
|
|
+connect to for Zen discovery when those IP addresses may not be known at node
|
|
|
+startup.
|
|
|
+
|
|
|
+To enable file-based discovery, configure the `file` hosts provider as follows:
|
|
|
+
|
|
|
+```
|
|
|
+discovery.zen.hosts_provider: file
|
|
|
+```
|
|
|
+
|
|
|
+Then create a file at `$ES_PATH_CONF/unicast_hosts.txt` in
|
|
|
+<<discovery-file-format,the format described below>>. Any time a change is made
|
|
|
+to the `unicast_hosts.txt` file the new changes will be picked up by
|
|
|
+Elasticsearch and the new hosts list will be used.
|
|
|
+
|
|
|
+Note that the file-based discovery plugin augments the unicast hosts list in
|
|
|
+`elasticsearch.yml`: if there are valid unicast host entries in
|
|
|
+`discovery.zen.ping.unicast.hosts` then they will be used in addition to those
|
|
|
+supplied in `unicast_hosts.txt`.
|
|
|
+
|
|
|
+The `discovery.zen.ping.unicast.resolve_timeout` setting also applies to DNS
|
|
|
+lookups for nodes specified by address via file-based discovery. This is
|
|
|
+specified as a <<time-units, time unit>> and defaults to 5s.
|
|
|
+
|
|
|
+[[discovery-file-format]]
|
|
|
+[float]
|
|
|
+====== unicast_hosts.txt file format
|
|
|
+
|
|
|
+The format of the file is to specify one node entry per line. Each node entry
|
|
|
+consists of the host (host name or IP address) and an optional transport port
|
|
|
+number. If the port number is specified, is must come immediately after the
|
|
|
+host (on the same line) separated by a `:`. If the port number is not
|
|
|
+specified, a default value of 9300 is used.
|
|
|
+
|
|
|
+For example, this is an example of `unicast_hosts.txt` for a cluster with four
|
|
|
+nodes that participate in unicast discovery, some of which are not running on
|
|
|
+the default port:
|
|
|
+
|
|
|
+[source,txt]
|
|
|
+----------------------------------------------------------------
|
|
|
+10.10.10.5
|
|
|
+10.10.10.6:9305
|
|
|
+10.10.10.5:10005
|
|
|
+# an IPv6 address
|
|
|
+[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:9301
|
|
|
+----------------------------------------------------------------
|
|
|
+
|
|
|
+Host names are allowed instead of IP addresses (similar to
|
|
|
+`discovery.zen.ping.unicast.hosts`), and IPv6 addresses must be specified in
|
|
|
+brackets with the port coming after the brackets.
|
|
|
+
|
|
|
+It is also possible to add comments to this file. All comments must appear on
|
|
|
+their lines starting with `#` (i.e. comments cannot start in the middle of a
|
|
|
+line).
|
|
|
|
|
|
[float]
|
|
|
[[master-election]]
|
|
|
==== Master Election
|
|
|
|
|
|
-As part of the ping process a master of the cluster is either
|
|
|
-elected or joined to. This is done automatically. The
|
|
|
-`discovery.zen.ping_timeout` (which defaults to `3s`) determines how long the node
|
|
|
-will wait before deciding on starting an election or joining an existing cluster.
|
|
|
-Three pings will be sent over this timeout interval. In case where no decision can be
|
|
|
-reached after the timeout, the pinging process restarts.
|
|
|
-In slow or congested networks, three seconds might not be enough for a node to become
|
|
|
-aware of the other nodes in its environment before making an election decision.
|
|
|
-Increasing the timeout should be done with care in that case, as it will slow down the
|
|
|
-election process.
|
|
|
-Once a node decides to join an existing formed cluster, it
|
|
|
-will send a join request to the master (`discovery.zen.join_timeout`)
|
|
|
-with a timeout defaulting at 20 times the ping timeout.
|
|
|
-
|
|
|
-When the master node stops or has encountered a problem, the cluster nodes
|
|
|
-start pinging again and will elect a new master. This pinging round also
|
|
|
-serves as a protection against (partial) network failures where a node may unjustly
|
|
|
-think that the master has failed. In this case the node will simply hear from
|
|
|
-other nodes about the currently active master.
|
|
|
-
|
|
|
-If `discovery.zen.master_election.ignore_non_master_pings` is `true`, pings from nodes that are not master
|
|
|
-eligible (nodes where `node.master` is `false`) are ignored during master election; the default value is
|
|
|
+As part of the ping process a master of the cluster is either elected or joined
|
|
|
+to. This is done automatically. The `discovery.zen.ping_timeout` (which defaults
|
|
|
+to `3s`) determines how long the node will wait before deciding on starting an
|
|
|
+election or joining an existing cluster. Three pings will be sent over this
|
|
|
+timeout interval. In case where no decision can be reached after the timeout,
|
|
|
+the pinging process restarts. In slow or congested networks, three seconds
|
|
|
+might not be enough for a node to become aware of the other nodes in its
|
|
|
+environment before making an election decision. Increasing the timeout should
|
|
|
+be done with care in that case, as it will slow down the election process. Once
|
|
|
+a node decides to join an existing formed cluster, it will send a join request
|
|
|
+to the master (`discovery.zen.join_timeout`) with a timeout defaulting at 20
|
|
|
+times the ping timeout.
|
|
|
+
|
|
|
+When the master node stops or has encountered a problem, the cluster nodes start
|
|
|
+pinging again and will elect a new master. This pinging round also serves as a
|
|
|
+protection against (partial) network failures where a node may unjustly think
|
|
|
+that the master has failed. In this case the node will simply hear from other
|
|
|
+nodes about the currently active master.
|
|
|
+
|
|
|
+If `discovery.zen.master_election.ignore_non_master_pings` is `true`, pings from
|
|
|
+nodes that are not master eligible (nodes where `node.master` is `false`) are
|
|
|
+ignored during master election; the default value is `false`.
|
|
|
+
|
|
|
+Nodes can be excluded from becoming a master by setting `node.master` to
|
|
|
`false`.
|
|
|
|
|
|
-Nodes can be excluded from becoming a master by setting `node.master` to `false`.
|
|
|
-
|
|
|
-The `discovery.zen.minimum_master_nodes` sets the minimum
|
|
|
-number of master eligible nodes that need to join a newly elected master in order for an election to
|
|
|
-complete and for the elected node to accept its mastership. The same setting controls the minimum number of
|
|
|
-active master eligible nodes that should be a part of any active cluster. If this requirement is not met the
|
|
|
-active master node will step down and a new master election will begin.
|
|
|
+The `discovery.zen.minimum_master_nodes` sets the minimum number of master
|
|
|
+eligible nodes that need to join a newly elected master in order for an election
|
|
|
+to complete and for the elected node to accept its mastership. The same setting
|
|
|
+controls the minimum number of active master eligible nodes that should be a
|
|
|
+part of any active cluster. If this requirement is not met the active master
|
|
|
+node will step down and a new master election will begin.
|
|
|
|
|
|
This setting must be set to a <<minimum_master_nodes,quorum>> of your master
|
|
|
eligible nodes. It is recommended to avoid having only two master eligible
|
|
|
-nodes, since a quorum of two is two. Therefore, a loss of either master
|
|
|
-eligible node will result in an inoperable cluster.
|
|
|
+nodes, since a quorum of two is two. Therefore, a loss of either master eligible
|
|
|
+node will result in an inoperable cluster.
|
|
|
|
|
|
[float]
|
|
|
[[fault-detection]]
|
|
|
==== Fault Detection
|
|
|
|
|
|
-There are two fault detection processes running. The first is by the
|
|
|
-master, to ping all the other nodes in the cluster and verify that they
|
|
|
-are alive. And on the other end, each node pings to master to verify if
|
|
|
-its still alive or an election process needs to be initiated.
|
|
|
+There are two fault detection processes running. The first is by the master, to
|
|
|
+ping all the other nodes in the cluster and verify that they are alive. And on
|
|
|
+the other end, each node pings to master to verify if its still alive or an
|
|
|
+election process needs to be initiated.
|
|
|
|
|
|
The following settings control the fault detection process using the
|
|
|
`discovery.zen.fd` prefix:
|
|
|
@@ -116,19 +188,21 @@ considered failed. Defaults to `3`.
|
|
|
|
|
|
The master node is the only node in a cluster that can make changes to the
|
|
|
cluster state. The master node processes one cluster state update at a time,
|
|
|
-applies the required changes and publishes the updated cluster state to all
|
|
|
-the other nodes in the cluster. Each node receives the publish message, acknowledges
|
|
|
-it, but does *not* yet apply it. If the master does not receive acknowledgement from
|
|
|
-at least `discovery.zen.minimum_master_nodes` nodes within a certain time (controlled by
|
|
|
-the `discovery.zen.commit_timeout` setting and defaults to 30 seconds) the cluster state
|
|
|
-change is rejected.
|
|
|
-
|
|
|
-Once enough nodes have responded, the cluster state is committed and a message will
|
|
|
-be sent to all the nodes. The nodes then proceed to apply the new cluster state to their
|
|
|
-internal state. The master node waits for all nodes to respond, up to a timeout, before
|
|
|
-going ahead processing the next updates in the queue. The `discovery.zen.publish_timeout` is
|
|
|
-set by default to 30 seconds and is measured from the moment the publishing started. Both
|
|
|
-timeout settings can be changed dynamically through the <<cluster-update-settings,cluster update settings api>>
|
|
|
+applies the required changes and publishes the updated cluster state to all the
|
|
|
+other nodes in the cluster. Each node receives the publish message, acknowledges
|
|
|
+it, but does *not* yet apply it. If the master does not receive acknowledgement
|
|
|
+from at least `discovery.zen.minimum_master_nodes` nodes within a certain time
|
|
|
+(controlled by the `discovery.zen.commit_timeout` setting and defaults to 30
|
|
|
+seconds) the cluster state change is rejected.
|
|
|
+
|
|
|
+Once enough nodes have responded, the cluster state is committed and a message
|
|
|
+will be sent to all the nodes. The nodes then proceed to apply the new cluster
|
|
|
+state to their internal state. The master node waits for all nodes to respond,
|
|
|
+up to a timeout, before going ahead processing the next updates in the queue.
|
|
|
+The `discovery.zen.publish_timeout` is set by default to 30 seconds and is
|
|
|
+measured from the moment the publishing started. Both timeout settings can be
|
|
|
+changed dynamically through the <<cluster-update-settings,cluster update
|
|
|
+settings api>>
|
|
|
|
|
|
[float]
|
|
|
[[no-master-block]]
|
|
|
@@ -143,10 +217,14 @@ rejected when there is no active master.
|
|
|
The `discovery.zen.no_master_block` setting has two valid options:
|
|
|
|
|
|
[horizontal]
|
|
|
-`all`:: All operations on the node--i.e. both read & writes--will be rejected. This also applies for api cluster state
|
|
|
-read or write operations, like the get index settings, put mapping and cluster state api.
|
|
|
-`write`:: (default) Write operations will be rejected. Read operations will succeed, based on the last known cluster configuration.
|
|
|
-This may result in partial reads of stale data as this node may be isolated from the rest of the cluster.
|
|
|
-
|
|
|
-The `discovery.zen.no_master_block` setting doesn't apply to nodes-based apis (for example cluster stats, node info and
|
|
|
-node stats apis). Requests to these apis will not be blocked and can run on any available node.
|
|
|
+`all`:: All operations on the node--i.e. both read & writes--will be rejected.
|
|
|
+This also applies for api cluster state read or write operations, like the get
|
|
|
+index settings, put mapping and cluster state api.
|
|
|
+`write`:: (default) Write operations will be rejected. Read operations will
|
|
|
+succeed, based on the last known cluster configuration. This may result in
|
|
|
+partial reads of stale data as this node may be isolated from the rest of the
|
|
|
+cluster.
|
|
|
+
|
|
|
+The `discovery.zen.no_master_block` setting doesn't apply to nodes-based apis
|
|
|
+(for example cluster stats, node info and node stats apis). Requests to these
|
|
|
+apis will not be blocked and can run on any available node.
|