123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051 |
- [[cluster-state-publishing]]
- === Publishing the cluster state
- The master node is the only node in a cluster that can make changes to the
- cluster state. The master node processes one batch of cluster state updates at
- a time, computing the required changes and publishing the updated cluster state
- to all the other nodes in the cluster. Each publication starts with the master
- broadcasting the updated cluster state to all nodes in the cluster. Each node
- responds with an acknowledgement but does not yet apply the newly-received
- state. Once the master has collected acknowledgements from enough
- master-eligible nodes, the new cluster state is said to be _committed_ and the
- master broadcasts another message instructing nodes to apply the now-committed
- state. Each node receives this message, applies the updated state, and then
- sends a second acknowledgement back to the master.
- The master allows a limited amount of time for each cluster state update to be
- completely published to all nodes. It is defined by the
- `cluster.publish.timeout` setting, which defaults to `30s`, measured from the
- time the publication started. If this time is reached before the new cluster
- state is committed then the cluster state change is rejected and the master
- considers itself to have failed. It stands down and starts trying to elect a
- new master.
- If the new cluster state is committed before `cluster.publish.timeout` has
- elapsed, the master node considers the change to have succeeded. It waits until
- the timeout has elapsed or until it has received acknowledgements that each
- node in the cluster has applied the updated state, and then starts processing
- and publishing the next cluster state update. If some acknowledgements have not
- been received (i.e. some nodes have not yet confirmed that they have applied
- the current update), these nodes are said to be _lagging_ since their cluster
- states have fallen behind the master's latest state. The master waits for the
- lagging nodes to catch up for a further time, `cluster.follower_lag.timeout`,
- which defaults to `90s`. If a node has still not successfully applied the
- cluster state update within this time then it is considered to have failed and
- is removed from the cluster.
- Cluster state updates are typically published as diffs to the previous cluster
- state, which reduces the time and network bandwidth needed to publish a cluster
- state update. For example, when updating the mappings for only a subset of the
- indices in the cluster state, only the updates for those indices need to be
- published to the nodes in the cluster, as long as those nodes have the previous
- cluster state. If a node is missing the previous cluster state, for example
- when rejoining a cluster, the master will publish the full cluster state to
- that node so that it can receive future updates as diffs.
- NOTE: {es} is a peer to peer based system, in which nodes communicate with one
- another directly. The high-throughput APIs (index, delete, search) do not
- normally interact with the master node. The responsibility of the master node
- is to maintain the global cluster state and reassign shards when nodes join or
- leave the cluster. Each time the cluster state is changed, the new state is
- published to all nodes in the cluster as described above.
|