123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960 |
- [[cluster-state-publishing]]
- === Publishing the cluster state
- The elected master node is the only node in a cluster that can make changes to
- the cluster state. The elected master node processes one batch of cluster state
- updates at a time, computing the required changes and publishing the updated
- cluster state to all the other nodes in the cluster. Each publication starts
- with the elected master broadcasting the updated cluster state to all nodes in
- the cluster. Each node responds with an acknowledgement but does not yet apply
- the newly-received state. Once the elected master has collected
- acknowledgements from enough master-eligible nodes, the new cluster state is
- said to be _committed_ and the master broadcasts another message instructing
- nodes to apply the now-committed state. Each node receives this message,
- applies the updated state, and then sends a second acknowledgement back to the
- master.
- The elected master allows a limited amount of time for each cluster state
- update to be completely published to all nodes. It is defined by the
- `cluster.publish.timeout` setting, which defaults to `30s`, measured from the
- time the publication started. If this time is reached before the new cluster
- state is committed then the cluster state change is rejected and the elected
- master considers itself to have failed. It stands down and starts trying to
- elect a new master node.
- If the new cluster state is committed before `cluster.publish.timeout` has
- elapsed, the elected master node considers the change to have succeeded. It
- waits until the timeout has elapsed or until it has received acknowledgements
- that each node in the cluster has applied the updated state, and then starts
- processing and publishing the next cluster state update. If some
- acknowledgements have not been received (i.e. some nodes have not yet confirmed
- that they have applied the current update), these nodes are said to be
- _lagging_ since their cluster states have fallen behind the elected master's
- latest state. The elected master waits for the lagging nodes to catch up for a
- further time, `cluster.follower_lag.timeout`, which defaults to `90s`. If a
- node has still not successfully applied the cluster state update within this
- time then it is considered to have failed and the elected master removes it
- from the cluster.
- Cluster state updates are typically published as diffs to the previous cluster
- state, which reduces the time and network bandwidth needed to publish a cluster
- state update. For example, when updating the mappings for only a subset of the
- indices in the cluster state, only the updates for those indices need to be
- published to the nodes in the cluster, as long as those nodes have the previous
- cluster state. If a node is missing the previous cluster state, for example
- when rejoining a cluster, the elected master will publish the full cluster
- state to that node so that it can receive future updates as diffs.
- NOTE: {es} is a peer to peer based system, in which nodes communicate with one
- another directly. The high-throughput APIs (index, delete, search) do not
- normally interact with the elected master node. The responsibility of the
- elected master node is to maintain the global cluster state which includes
- reassigning shards when nodes join or leave the cluster. Each time the cluster
- state is changed, the new state is published to all nodes in the cluster as
- described above.
- The performance characteristics of cluster state updates are a function of the
- speed of the storage on each master-eligible node, as well as the reliability
- and latency of the network interconnections between all nodes in the cluster.
- You must therefore ensure that the storage and networking available to the
- nodes in your cluster are good enough to meet your performance goals.
|