[[modules-node]]
== Node

Any time that you start an instance of Elasticsearch, you are starting a
_node_. A collection of connected nodes is called a
<<modules-cluster,cluster>>. If you are running a single node of Elasticsearch,
then you have a cluster of one node.

Every node in the cluster can handle <<modules-http,HTTP>> and
<<modules-transport,Transport>> traffic by default. The transport layer
is used exclusively for communication between nodes; the HTTP layer is
used by REST clients.

All nodes know about all the other nodes in the cluster and can forward client
requests to the appropriate node.
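
The two layers listen on separate ports and can be configured independently.
A minimal sketch, assuming the default port ranges are free (the values shown
are only examples):

[source,yaml]
-------------------
http.port: 9200       # port used by REST clients (HTTP layer)
transport.port: 9300  # port used for node-to-node communication (transport layer)
-------------------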

By default, a node is all of the following types: master-eligible, data, ingest,
and machine learning (if available).

TIP: As the cluster grows and in particular if you have large {ml} jobs,
consider separating dedicated master-eligible nodes from dedicated data nodes
and dedicated {ml} nodes.

<<master-node,Master-eligible node>>::

A node that has `node.master` set to `true` (default), which makes it eligible
to be <<modules-discovery,elected as the _master_ node>>, which controls
the cluster.

<<data-node,Data node>>::

A node that has `node.data` set to `true` (default). Data nodes hold data and
perform data related operations such as CRUD, search, and aggregations.

<<ingest,Ingest node>>::

A node that has `node.ingest` set to `true` (default). Ingest nodes are able
to apply an <<pipeline,ingest pipeline>> to a document in order to transform
and enrich the document before indexing. With a heavy ingest load, it makes
sense to use dedicated ingest nodes and to mark the master and data nodes as
`node.ingest: false`.

<<ml-node,Machine learning node>>::

A node that has `xpack.ml.enabled` and `node.ml` set to `true`, which is the
default behavior in the {es} {default-dist}. If you want to use {ml-features},
there must be at least one {ml} node in your cluster. For more information about
{ml-features}, see
{stack-ov}/xpack-ml.html[Machine learning in the {stack}].
+
IMPORTANT: If you use the {oss-dist}, do not set `node.ml`. Otherwise, the node
fails to start.

[NOTE]
[[coordinating-node]]
.Coordinating node
===============================================

Requests like search requests or bulk-indexing requests may involve data held
on different data nodes. A search request, for example, is executed in two
phases which are coordinated by the node which receives the client request --
the _coordinating node_.

In the _scatter_ phase, the coordinating node forwards the request to the data
nodes which hold the data. Each data node executes the request locally and
returns its results to the coordinating node. In the _gather_ phase, the
coordinating node reduces each data node's results into a single global
resultset.

Every node is implicitly a coordinating node. This means that a node that has
all three `node.master`, `node.data` and `node.ingest` set to `false` will
only act as a coordinating node, which cannot be disabled. As a result, such
a node needs to have enough memory and CPU in order to deal with the gather
phase.

===============================================

[float]
[[master-node]]
=== Master Eligible Node

The master node is responsible for lightweight cluster-wide actions such as
creating or deleting an index, tracking which nodes are part of the cluster,
and deciding which shards to allocate to which nodes. It is important for
cluster health to have a stable master node.

Any master-eligible node that is not a <<voting-only-node,voting-only node>> may
be elected to become the master node by the <<modules-discovery,master election
process>>.

IMPORTANT: Master nodes must have access to the `data/` directory (just like
`data` nodes) as this is where the cluster state is persisted between node restarts.

Indexing and searching your data is CPU-, memory-, and I/O-intensive work
which can put pressure on a node's resources. To ensure that your master
node is stable and not under pressure, it is a good idea in a bigger
cluster to split the roles between dedicated master-eligible nodes and
dedicated data nodes.

While master nodes can also behave as <<coordinating-node,coordinating nodes>>
and route search and indexing requests from clients to data nodes, it is
better _not_ to use dedicated master nodes for this purpose. It is important
for the stability of the cluster that master-eligible nodes do as little work
as possible.

To create a dedicated master-eligible node in the {default-dist}, set:

[source,yaml]
-------------------
node.master: true <1>
node.voting_only: false <2>
node.data: false <3>
node.ingest: false <4>
node.ml: false <5>
xpack.ml.enabled: true <6>
cluster.remote.connect: false <7>
-------------------
<1> The `node.master` role is enabled by default.
<2> The `node.voting_only` role is disabled by default.
<3> Disable the `node.data` role (enabled by default).
<4> Disable the `node.ingest` role (enabled by default).
<5> Disable the `node.ml` role (enabled by default).
<6> The `xpack.ml.enabled` setting is enabled by default.
<7> Disable {ccs} (enabled by default).

To create a dedicated master-eligible node in the {oss-dist}, set:

[source,yaml]
-------------------
node.master: true <1>
node.data: false <2>
node.ingest: false <3>
cluster.remote.connect: false <4>
-------------------
<1> The `node.master` role is enabled by default.
<2> Disable the `node.data` role (enabled by default).
<3> Disable the `node.ingest` role (enabled by default).
<4> Disable {ccs} (enabled by default).

[float]
[[voting-only-node]]
==== Voting-only master-eligible node

A voting-only master-eligible node is a node that participates in
<<modules-discovery,master elections>> but which will not act as the cluster's
elected master node. In particular, a voting-only node can serve as a tiebreaker
in elections.

It may seem confusing to use the term "master-eligible" to describe a
voting-only node since such a node is not actually eligible to become the master
at all. This terminology is an unfortunate consequence of history:
master-eligible nodes are those nodes that participate in elections and perform
certain tasks during cluster state publications, and voting-only nodes have the
same responsibilities even if they can never become the elected master.

To configure a master-eligible node as a voting-only node, set the following
setting:

[source,yaml]
-------------------
node.voting_only: true <1>
-------------------
<1> The default for `node.voting_only` is `false`.

IMPORTANT: The `voting_only` role requires the {default-dist} of Elasticsearch
and is not supported in the {oss-dist}. If you use the {oss-dist} and set
`node.voting_only` then the node will fail to start. Also note that only
master-eligible nodes can be marked as voting-only.

High availability (HA) clusters require at least three master-eligible nodes, at
least two of which are not voting-only nodes. Such a cluster will be able to
elect a master node even if one of the nodes fails.

Since voting-only nodes never act as the cluster's elected master, they may
require less heap and a less powerful CPU than the true master nodes.
However all master-eligible nodes, including voting-only nodes, require
reasonably fast persistent storage and a reliable and low-latency network
connection to the rest of the cluster, since they are on the critical path for
<<cluster-state-publishing,publishing cluster state updates>>.

Voting-only master-eligible nodes may also fill other roles in your cluster.
For instance, a node may be both a data node and a voting-only master-eligible
node. A _dedicated_ voting-only master-eligible node is a voting-only
master-eligible node that fills no other roles in the cluster. To create a
dedicated voting-only master-eligible node in the {default-dist}, set:

[source,yaml]
-------------------
node.master: true <1>
node.voting_only: true <2>
node.data: false <3>
node.ingest: false <4>
node.ml: false <5>
xpack.ml.enabled: true <6>
cluster.remote.connect: false <7>
-------------------
<1> The `node.master` role is enabled by default.
<2> Enable the `node.voting_only` role (disabled by default).
<3> Disable the `node.data` role (enabled by default).
<4> Disable the `node.ingest` role (enabled by default).
<5> Disable the `node.ml` role (enabled by default).
<6> The `xpack.ml.enabled` setting is enabled by default.
<7> Disable {ccs} (enabled by default).

[float]
[[data-node]]
=== Data Node

Data nodes hold the shards that contain the documents you have indexed. Data
nodes handle data related operations like CRUD, search, and aggregations.
These operations are I/O-, memory-, and CPU-intensive. It is important to
monitor these resources and to add more data nodes if they are overloaded.

The main benefit of having dedicated data nodes is the separation of the
master and data roles.

To create a dedicated data node in the {default-dist}, set:

[source,yaml]
-------------------
node.master: false <1>
node.voting_only: false <2>
node.data: true <3>
node.ingest: false <4>
node.ml: false <5>
cluster.remote.connect: false <6>
-------------------
<1> Disable the `node.master` role (enabled by default).
<2> The `node.voting_only` role is disabled by default.
<3> The `node.data` role is enabled by default.
<4> Disable the `node.ingest` role (enabled by default).
<5> Disable the `node.ml` role (enabled by default).
<6> Disable {ccs} (enabled by default).

To create a dedicated data node in the {oss-dist}, set:

[source,yaml]
-------------------
node.master: false <1>
node.data: true <2>
node.ingest: false <3>
cluster.remote.connect: false <4>
-------------------
<1> Disable the `node.master` role (enabled by default).
<2> The `node.data` role is enabled by default.
<3> Disable the `node.ingest` role (enabled by default).
<4> Disable {ccs} (enabled by default).

[float]
[[node-ingest-node]]
=== Ingest Node

Ingest nodes can execute pre-processing pipelines, composed of one or more
ingest processors. Depending on the type of operations performed by the ingest
processors and the required resources, it may make sense to have dedicated
ingest nodes that only perform this specific task.
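
For example, a pipeline is defined through the ingest APIs and then referenced
when indexing documents. A minimal sketch, assuming a hypothetical pipeline
named `add-env` and a hypothetical index `my-index` (the pipeline name, index
name, and field values are only illustrative):

[source,sh]
-------------------
# Define a pipeline that adds an "env" field to each incoming document.
curl -X PUT "localhost:9200/_ingest/pipeline/add-env" \
  -H 'Content-Type: application/json' -d'
{
  "description": "Tag incoming documents with an environment field",
  "processors": [
    { "set": { "field": "env", "value": "production" } }
  ]
}'

# Index a document through the pipeline.
curl -X PUT "localhost:9200/my-index/_doc/1?pipeline=add-env" \
  -H 'Content-Type: application/json' -d'
{ "message": "hello" }'
-------------------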

To create a dedicated ingest node in the {default-dist}, set:

[source,yaml]
-------------------
node.master: false <1>
node.voting_only: false <2>
node.data: false <3>
node.ingest: true <4>
node.ml: false <5>
cluster.remote.connect: false <6>
-------------------
<1> Disable the `node.master` role (enabled by default).
<2> The `node.voting_only` role is disabled by default.
<3> Disable the `node.data` role (enabled by default).
<4> The `node.ingest` role is enabled by default.
<5> Disable the `node.ml` role (enabled by default).
<6> Disable {ccs} (enabled by default).

To create a dedicated ingest node in the {oss-dist}, set:

[source,yaml]
-------------------
node.master: false <1>
node.data: false <2>
node.ingest: true <3>
cluster.remote.connect: false <4>
-------------------
<1> Disable the `node.master` role (enabled by default).
<2> Disable the `node.data` role (enabled by default).
<3> The `node.ingest` role is enabled by default.
<4> Disable {ccs} (enabled by default).

[float]
[[coordinating-only-node]]
=== Coordinating only node

If you take away the ability to handle master duties, to hold data, and to
pre-process documents, then you are left with a _coordinating_ node that
can only route requests, handle the search reduce phase, and distribute bulk
indexing. Essentially, coordinating only nodes behave as smart load balancers.

Coordinating only nodes can benefit large clusters by offloading the
coordinating node role from data and master-eligible nodes. They join the
cluster and receive the full <<cluster-state,cluster state>>, like every other
node, and they use the cluster state to route requests directly to the
appropriate place(s).

WARNING: Adding too many coordinating only nodes to a cluster can increase the
burden on the entire cluster because the elected master node must await
acknowledgement of cluster state updates from every node! The benefit of
coordinating only nodes should not be overstated -- data nodes can happily
serve the same purpose.

To create a dedicated coordinating node in the {default-dist}, set:

[source,yaml]
-------------------
node.master: false <1>
node.voting_only: false <2>
node.data: false <3>
node.ingest: false <4>
node.ml: false <5>
cluster.remote.connect: false <6>
-------------------
<1> Disable the `node.master` role (enabled by default).
<2> The `node.voting_only` role is disabled by default.
<3> Disable the `node.data` role (enabled by default).
<4> Disable the `node.ingest` role (enabled by default).
<5> Disable the `node.ml` role (enabled by default).
<6> Disable {ccs} (enabled by default).

To create a dedicated coordinating node in the {oss-dist}, set:

[source,yaml]
-------------------
node.master: false <1>
node.data: false <2>
node.ingest: false <3>
cluster.remote.connect: false <4>
-------------------
<1> Disable the `node.master` role (enabled by default).
<2> Disable the `node.data` role (enabled by default).
<3> Disable the `node.ingest` role (enabled by default).
<4> Disable {ccs} (enabled by default).

[float]
[[ml-node]]
=== [xpack]#Machine learning node#

The {ml-features} provide {ml} nodes, which run jobs and handle {ml} API
requests. If `xpack.ml.enabled` is set to `true` and `node.ml` is set to
`false`, the node can service API requests but it cannot run jobs.
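
A minimal sketch of that combination, for a node in the {default-dist} that
should answer {ml} API requests without running jobs:

[source,yaml]
-------------------
xpack.ml.enabled: true
node.ml: false
-------------------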

If you want to use {ml-features} in your cluster, you must enable {ml}
(set `xpack.ml.enabled` to `true`) on all master-eligible nodes. If you have the
{oss-dist}, do not use these settings.

For more information about these settings, see <<ml-settings>>.

To create a dedicated {ml} node in the {default-dist}, set:

[source,yaml]
-------------------
node.master: false <1>
node.voting_only: false <2>
node.data: false <3>
node.ingest: false <4>
node.ml: true <5>
xpack.ml.enabled: true <6>
cluster.remote.connect: false <7>
-------------------
<1> Disable the `node.master` role (enabled by default).
<2> The `node.voting_only` role is disabled by default.
<3> Disable the `node.data` role (enabled by default).
<4> Disable the `node.ingest` role (enabled by default).
<5> The `node.ml` role is enabled by default.
<6> The `xpack.ml.enabled` setting is enabled by default.
<7> Disable {ccs} (enabled by default).

[float]
[[change-node-role]]
=== Changing the role of a node

Each data node maintains the following data on disk:

* the shard data for every shard allocated to that node,
* the index metadata corresponding with every shard allocated to that node, and
* the cluster-wide metadata, such as settings and index templates.

Similarly, each master-eligible node maintains the following data on disk:

* the index metadata for every index in the cluster, and
* the cluster-wide metadata, such as settings and index templates.

Each node checks the contents of its data path at startup. If it discovers
unexpected data then it will refuse to start. This is to avoid importing
unwanted <<modules-gateway-dangling-indices,dangling indices>> which can lead
to a red cluster health. To be more precise, nodes with `node.data: false` will
refuse to start if they find any shard data on disk at startup, and nodes with
both `node.master: false` and `node.data: false` will refuse to start if they
have any index metadata on disk at startup.

It is possible to change the roles of a node by adjusting its
`elasticsearch.yml` file and restarting it. This is known as _repurposing_ a
node. In order to satisfy the checks for unexpected data described above, you
must perform some extra steps to prepare a node for repurposing when setting
its `node.data` or `node.master` roles to `false`:

* If you want to repurpose a data node by changing `node.data` to `false` then
  you should first use an <<allocation-filtering,allocation filter>> to safely
  migrate all the shard data onto other nodes in the cluster, as shown in the
  sketch after this list.

* If you want to repurpose a node to have both `node.master: false` and
  `node.data: false` then it is simplest to start a brand-new node with an
  empty data path and the desired roles. You may find it safest to use an
  <<allocation-filtering,allocation filter>> to migrate the shard data
  elsewhere in the cluster first.
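
A minimal sketch of draining a data node before repurposing it, using the
cluster-level shard allocation filtering settings and assuming the node being
repurposed is named `data-node-1` (the node name is only illustrative):

[source,sh]
-------------------
# Exclude the node from shard allocation so its shards move elsewhere.
curl -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.exclude._name": "data-node-1"
  }
}'

# Once the node holds no shards (check GET _cat/allocation?v), stop the node,
# set node.data: false in elasticsearch.yml, and restart it.
-------------------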

If it is not possible to follow these extra steps then you may be able to use
the <<node-tool-repurpose,`elasticsearch-node repurpose`>> tool to delete any
excess data that prevents a node from starting.
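
A minimal sketch of using that tool; it is run on the stopped node, after the
roles have already been changed in `elasticsearch.yml`, and it deletes the
on-disk data that is no longer appropriate for the new roles:

[source,sh]
-------------------
# Run while the node is shut down, after updating its roles in elasticsearch.yml.
./bin/elasticsearch-node repurpose
-------------------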

[float]
== Node data path settings

[float]
[[data-path]]
=== `path.data`

Every data and master-eligible node requires access to a data directory where
shards and index and cluster metadata will be stored. The `path.data` defaults
to `$ES_HOME/data` but can be configured in the `elasticsearch.yml` config
file as an absolute path or a path relative to `$ES_HOME` as follows:

[source,yaml]
-----------------------
path.data: /var/elasticsearch/data
-----------------------

Like all node settings, it can also be specified on the command line as:

[source,sh]
-----------------------
./bin/elasticsearch -Epath.data=/var/elasticsearch/data
-----------------------

TIP: When using the `.zip` or `.tar.gz` distributions, the `path.data` setting
should be configured to locate the data directory outside the Elasticsearch
home directory, so that the home directory can be deleted without deleting
your data! The RPM and Debian distributions do this for you already.

[float]
== Other node settings

More node settings can be found in <<modules,Modules>>. Of particular note are
the <<cluster.name,`cluster.name`>>, the <<node.name,`node.name`>> and the
<<modules-network,network settings>>.
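
For instance, a minimal sketch that names the cluster and the node (the values
shown are only examples):

[source,yaml]
-------------------
cluster.name: my-production-cluster
node.name: data-node-1
-------------------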