[[modules-node]]
=== Node settings

Any time that you start an instance of {es}, you are starting a _node_. A
collection of connected nodes is called a <<modules-cluster,cluster>>. If you
are running a single node of {es}, then you have a cluster of one node.

Every node in the cluster can handle <<modules-network,HTTP and transport>>
traffic by default. The transport layer is used exclusively for communication
between nodes; the HTTP layer is used by REST clients.
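
As a rough sketch of how the two layers are exposed, each listens on its own
port, configurable through the <<modules-network,network settings>> in
`elasticsearch.yml`. The values below are the usual defaults (the first ports
in the default ranges) and are shown only for illustration; you normally don't
need to set them:

[source,yaml]
----
http.port: 9200       # HTTP layer, used by REST clients
transport.port: 9300  # transport layer, used for node-to-node communication
----
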
[[modules-node-description]]
// tag::modules-node-description-tag[]
All nodes know about all the other nodes in the cluster and can forward client
requests to the appropriate node.
// end::modules-node-description-tag[]

TIP: The performance of an {es} node is often limited by the performance of the underlying storage.
Review our recommendations for optimizing your storage for <<indexing-use-faster-hardware,indexing>> and
<<search-use-faster-hardware,search>>.

[[node-name-settings]]
==== Node name setting

include::{es-ref-dir}/setup/important-settings/node-name.asciidoc[]

[[node-roles]]
==== Node role settings

You define a node's roles by setting `node.roles` in `elasticsearch.yml`. If you
set `node.roles`, the node is only assigned the roles you specify. If you don't
set `node.roles`, the node is assigned the following roles:

* [[master-node]]`master`
* [[data-node]]`data`
* `data_content`
* `data_hot`
* `data_warm`
* `data_cold`
* `data_frozen`
* `ingest`
* [[ml-node]]`ml`
* `remote_cluster_client`
* [[transform-node]]`transform`

The following additional roles are available:

* `voting_only`
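
For example, a minimal `elasticsearch.yml` sketch that restricts a node to a
hand-picked combination of roles (the specific mix shown here is only an
illustration, not a recommendation for any particular cluster):

[source,yaml]
----
node.roles: [ master, data_hot, data_content ]
----

With this setting the node is assigned only the three listed roles and none of
the other defaults listed above.
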
[NOTE]
[[coordinating-only-node]]
.Coordinating node
===============================================
Requests like search requests or bulk-indexing requests may involve data held
on different data nodes. A search request, for example, is executed in two
phases, which are coordinated by the node that receives the client request --
the _coordinating node_.

In the _scatter_ phase, the coordinating node forwards the request to the data
nodes that hold the data. Each data node executes the request locally and
returns its results to the coordinating node. In the _gather_ phase, the
coordinating node reduces each data node's results into a single global
result set.

Every node is implicitly a coordinating node, and this behavior cannot be
disabled. A node with an explicitly empty list of roles in the `node.roles`
setting acts only as a coordinating node, so it needs enough memory and CPU
to handle the gather phase.
===============================================
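
For example, a node configured with an explicitly empty list of roles acts
only as a coordinating node:

[source,yaml]
----
node.roles: [ ]
----
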
[IMPORTANT]
====
If you set `node.roles`, ensure you specify every node role your cluster needs.
Every cluster requires the following node roles:

* `master`
* {blank}
+
--
`data_content` and `data_hot` +
OR +
`data`
--

Some {stack} features also require specific node roles:

- {ccs-cap} and {ccr} require the `remote_cluster_client` role.
- {stack-monitor-app} and ingest pipelines require the `ingest` role.
- {fleet}, the {security-app}, and {transforms} require the `transform` role.
  The `remote_cluster_client` role is also required to use {ccs} with these
  features.
- {ml-cap} features, such as {anomaly-detect}, require the `ml` role.
====

As the cluster grows, and in particular if you have large {ml} jobs or
{ctransforms}, consider separating dedicated master-eligible nodes from
dedicated data nodes, {ml} nodes, and {transform} nodes.

To learn more about the available node roles, see <<node-roles-overview>>.
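
As an illustration of that separation, a dedicated master-eligible node can be
restricted to the `master` role alone (a sketch; the role mix for your other
node types depends on your cluster):

[source,yaml]
----
node.roles: [ master ]
----

Such a node holds no shard data and runs no {ml} jobs or {transforms}, leaving
its resources free for managing the cluster.
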
[discrete]
=== Node data path settings

[[data-path]]
==== `path.data`

Every data and master-eligible node requires access to a data directory where
shards and index and cluster metadata will be stored. The `path.data` setting
defaults to `$ES_HOME/data` but can be configured in the `elasticsearch.yml`
config file as an absolute path or a path relative to `$ES_HOME` as follows:

[source,yaml]
----
path.data: /var/elasticsearch/data
----

Like all node settings, it can also be specified on the command line as:

[source,sh]
----
./bin/elasticsearch -Epath.data=/var/elasticsearch/data
----

The contents of the `path.data` directory must persist across restarts, because
this is where your data is stored. {es} requires the filesystem to act as if it
were backed by a local disk, which means that it will work correctly on
properly-configured remote block devices (e.g. a SAN) and remote filesystems
(e.g. NFS) as long as the remote storage behaves no differently from local
storage. You can run multiple {es} nodes on the same filesystem, but each {es}
node must have its own data path.
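
For example, two nodes sharing one host and filesystem could each be started
with their own data path (a sketch only; the directory layout is hypothetical,
and in practice each node also needs its own configuration, such as distinct
port settings):

[source,sh]
----
./bin/elasticsearch -Epath.data=/var/elasticsearch/node-1/data
./bin/elasticsearch -Epath.data=/var/elasticsearch/node-2/data
----
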
TIP: When using the `.zip` or `.tar.gz` distributions, the `path.data` setting
should be configured to locate the data directory outside the {es} home
directory, so that the home directory can be deleted without deleting your data!
The RPM and Debian distributions do this for you already.

// tag::modules-node-data-path-warning-tag[]
WARNING: Don't modify anything within the data directory or run processes that
might interfere with its contents. If something other than {es} modifies the
contents of the data directory, then {es} may fail, reporting corruption or
other data inconsistencies, or may appear to work correctly having silently
lost some of your data. Don't attempt to take filesystem backups of the data
directory; there is no supported way to restore such a backup. Instead, use
<<snapshot-restore>> to take backups safely. Don't run virus scanners on the
data directory. A virus scanner can prevent {es} from working correctly and may
modify the contents of the data directory. The data directory contains no
executables so a virus scan will only find false positives.
// end::modules-node-data-path-warning-tag[]

[[custom-node-attributes]]
==== Custom node attributes

If needed, you can add custom attributes to a node. These attributes can be used to <<cluster-routing-settings,filter which nodes a shard can be allocated to>>, or to group nodes together for <<shard-allocation-awareness,shard allocation awareness>>.

[TIP]
===============================================
You can also set a node attribute using the `-E` command line argument when you start a node:

[source,sh]
--------------------------------------------------------
./bin/elasticsearch -Enode.attr.rack_id=rack_one
--------------------------------------------------------
===============================================

`node.attr.<attribute-name>`::
(<<dynamic-cluster-setting,Dynamic>>)
A custom attribute that you can assign to a node. For example, you might assign a `rack_id` attribute to each node to ensure that primary and replica shards are not allocated on the same rack. You can specify multiple attributes as a comma-separated list.
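
A sketch of setting the same example attribute persistently in
`elasticsearch.yml` (the `rack_id` name and value are arbitrary):

[source,yaml]
----
node.attr.rack_id: rack_one
----

To have the cluster act on the attribute, reference it from the relevant
allocation settings, for example
`cluster.routing.allocation.awareness.attributes: rack_id` for
<<shard-allocation-awareness,shard allocation awareness>>.
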
[discrete]
[[other-node-settings]]
=== Other node settings

More node settings can be found in <<settings>> and <<important-settings>>,
including:

* <<cluster-name,`cluster.name`>>
* <<node-name,`node.name`>>
* <<modules-network,network settings>>