|
@@ -1,5 +1,5 @@
|
|
|
[[modules-node]]
|
|
|
-=== Nodes
|
|
|
+=== Node settings
|
|
|
|
|
|
Any time that you start an instance of {es}, you are starting a _node_. A
|
|
|
collection of connected nodes is called a <<modules-cluster,cluster>>. If you
|
|
@@ -18,24 +18,33 @@ TIP: The performance of an {es} node is often limited by the performance of the
|
|
|
Review our recommendations for optimizing your storage for <<indexing-use-faster-hardware,indexing>> and
|
|
|
<<search-use-faster-hardware,search>>.
|
|
|
|
|
|
+[[node-name-settings]]
|
|
|
+==== Node name setting
|
|
|
+
|
|
|
+include::{es-ref-dir}/setup/important-settings/node-name.asciidoc[]
|
|
|
+
|
|
|
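+For example, a minimal sketch of setting an explicit node name in
+`elasticsearch.yml` (the value `node-1` is only an illustration):
+
+[source,yaml]
+----
+node.name: node-1
+----
+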
[[node-roles]]
|
|
|
-==== Node roles
|
|
|
+==== Node role settings
|
|
|
|
|
|
You define a node's roles by setting `node.roles` in `elasticsearch.yml`. If you
|
|
|
set `node.roles`, the node is only assigned the roles you specify. If you don't
|
|
|
set `node.roles`, the node is assigned the following roles:
|
|
|
|
|
|
-* `master`
|
|
|
-* `data`
|
|
|
+* [[master-node]]`master`
|
|
|
+* [[data-node]]`data`
|
|
|
* `data_content`
|
|
|
* `data_hot`
|
|
|
* `data_warm`
|
|
|
* `data_cold`
|
|
|
* `data_frozen`
|
|
|
* `ingest`
|
|
|
-* `ml`
|
|
|
+* [[ml-node]]`ml`
|
|
|
* `remote_cluster_client`
|
|
|
-* `transform`
|
|
|
+* [[transform-node]]`transform`
|
|
|
+
|
|
|
+The following additional roles are available:
|
|
|
+
|
|
|
+* `voting_only`
|
|
|
|
|
|
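+
+For example, to restrict a node to the master, data, and ingest roles only,
+you could set the following in `elasticsearch.yml` (a minimal sketch; choose
+the combination of roles that fits your deployment):
+
+[source,yaml]
+----
+node.roles: [ master, data, ingest ]
+----
+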
[IMPORTANT]
|
|
|
====
|
|
@@ -65,386 +74,7 @@ As the cluster grows and in particular if you have large {ml} jobs or
|
|
|
{ctransforms}, consider separating dedicated master-eligible nodes from
|
|
|
dedicated data nodes, {ml} nodes, and {transform} nodes.
|
|
|
|
|
|
-<<master-node,Master-eligible node>>::
|
|
|
-
|
|
|
-A node that has the `master` role, which makes it eligible to be
|
|
|
-<<modules-discovery,elected as the _master_ node>>, which controls the cluster.
|
|
|
-
|
|
|
-<<data-node,Data node>>::
|
|
|
-
|
|
|
-A node that has one of several data roles. Data nodes hold data and perform data
|
|
|
-related operations such as CRUD, search, and aggregations. A node with a generic `data` role can fill any of the specialized data node roles.
|
|
|
-
|
|
|
-<<node-ingest-node,Ingest node>>::
|
|
|
-
|
|
|
-A node that has the `ingest` role. Ingest nodes are able to apply an
|
|
|
-<<ingest,ingest pipeline>> to a document in order to transform and enrich the
|
|
|
-document before indexing. With a heavy ingest load, it makes sense to use
|
|
|
-dedicated ingest nodes and to not include the `ingest` role from nodes that have
|
|
|
-the `master` or `data` roles.
|
|
|
-
|
|
|
-<<remote-node,Remote-eligible node>>::
|
|
|
-
|
|
|
-A node that has the `remote_cluster_client` role, which makes it eligible to act
|
|
|
-as a remote client.
|
|
|
-
|
|
|
-<<ml-node,Machine learning node>>::
|
|
|
-
|
|
|
-A node that has the `ml` role. If you want to use {ml-features}, there must be
|
|
|
-at least one {ml} node in your cluster. For more information, see
|
|
|
-<<ml-settings>> and {ml-docs}/index.html[Machine learning in the {stack}].
|
|
|
-
|
|
|
-<<transform-node,{transform-cap} node>>::
|
|
|
-
|
|
|
-A node that has the `transform` role. If you want to use {transforms}, there
|
|
|
-must be at least one {transform} node in your cluster. For more information, see
|
|
|
-<<transform-settings>> and <<transforms>>.
|
|
|
-
|
|
|
-[NOTE]
|
|
|
-[[coordinating-node]]
|
|
|
-.Coordinating node
|
|
|
-===============================================
|
|
|
-
|
|
|
-Requests like search requests or bulk-indexing requests may involve data held
|
|
|
-on different data nodes. A search request, for example, is executed in two
|
|
|
-phases which are coordinated by the node which receives the client request --
|
|
|
-the _coordinating node_.
|
|
|
-
|
|
|
-In the _scatter_ phase, the coordinating node forwards the request to the data
|
|
|
-nodes which hold the data. Each data node executes the request locally and
|
|
|
-returns its results to the coordinating node. In the _gather_ phase, the
|
|
|
-coordinating node reduces each data node's results into a single global
|
|
|
-result set.
|
|
|
-
|
|
|
-Every node is implicitly a coordinating node. This means that a node that has
|
|
|
-an explicit empty list of roles via `node.roles` will only act as a coordinating
|
|
|
-node, which cannot be disabled. As a result, such a node needs to have enough
|
|
|
-memory and CPU in order to deal with the gather phase.
|
|
|
-
|
|
|
-===============================================
|
|
|
-
|
|
|
-[[master-node]]
|
|
|
-==== Master-eligible node
|
|
|
-
|
|
|
-The master node is responsible for lightweight cluster-wide actions such as
|
|
|
-creating or deleting an index, tracking which nodes are part of the cluster,
|
|
|
-and deciding which shards to allocate to which nodes. It is important for
|
|
|
-cluster health to have a stable master node.
|
|
|
-
|
|
|
-Any master-eligible node that is not a <<voting-only-node,voting-only node>> may
|
|
|
-be elected to become the master node by the <<modules-discovery,master election
|
|
|
-process>>.
|
|
|
-
|
|
|
-IMPORTANT: Master nodes must have a `path.data` directory whose contents
|
|
|
-persist across restarts, just like data nodes, because this is where the
|
|
|
-cluster metadata is stored. The cluster metadata describes how to read the data
|
|
|
-stored on the data nodes, so if it is lost then the data stored on the data
|
|
|
-nodes cannot be read.
|
|
|
-
|
|
|
-[[dedicated-master-node]]
|
|
|
-===== Dedicated master-eligible node
|
|
|
-
|
|
|
-It is important for the health of the cluster that the elected master node has
|
|
|
-the resources it needs to fulfill its responsibilities. If the elected master
|
|
|
-node is overloaded with other tasks then the cluster will not operate well. The
|
|
|
-most reliable way to avoid overloading the master with other tasks is to
|
|
|
-configure all the master-eligible nodes to be _dedicated master-eligible nodes_
|
|
|
-which only have the `master` role, allowing them to focus on managing the
|
|
|
-cluster. Master-eligible nodes will still also behave as
|
|
|
-<<coordinating-node,coordinating nodes>> that route requests from clients to
|
|
|
-the other nodes in the cluster, but you should _not_ use dedicated master nodes
|
|
|
-for this purpose.
|
|
|
-
|
|
|
-A small or lightly-loaded cluster may operate well if its master-eligible nodes
|
|
|
-have other roles and responsibilities, but once your cluster comprises more
|
|
|
-than a handful of nodes it usually makes sense to use dedicated master-eligible
|
|
|
-nodes.
|
|
|
-
|
|
|
-To create a dedicated master-eligible node, set:
|
|
|
-
|
|
|
-[source,yaml]
|
|
|
--------------------
|
|
|
-node.roles: [ master ]
|
|
|
--------------------
|
|
|
-
|
|
|
-[[voting-only-node]]
|
|
|
-===== Voting-only master-eligible node
|
|
|
-
|
|
|
-A voting-only master-eligible node is a node that participates in
|
|
|
-<<modules-discovery,master elections>> but which will not act as the cluster's
|
|
|
-elected master node. In particular, a voting-only node can serve as a tiebreaker
|
|
|
-in elections.
|
|
|
-
|
|
|
-It may seem confusing to use the term "master-eligible" to describe a
|
|
|
-voting-only node since such a node is not actually eligible to become the master
|
|
|
-at all. This terminology is an unfortunate consequence of history:
|
|
|
-master-eligible nodes are those nodes that participate in elections and perform
|
|
|
-certain tasks during cluster state publications, and voting-only nodes have the
|
|
|
-same responsibilities even if they can never become the elected master.
|
|
|
-
|
|
|
-To configure a master-eligible node as a voting-only node, include `master` and
|
|
|
-`voting_only` in the list of roles. For example to create a voting-only data
|
|
|
-node:
|
|
|
-
|
|
|
-[source,yaml]
|
|
|
--------------------
|
|
|
-node.roles: [ data, master, voting_only ]
|
|
|
--------------------
|
|
|
-
|
|
|
-IMPORTANT: Only nodes with the `master` role can be marked as having the
|
|
|
-`voting_only` role.
|
|
|
-
|
|
|
-High availability (HA) clusters require at least three master-eligible nodes, at
|
|
|
-least two of which are not voting-only nodes. Such a cluster will be able to
|
|
|
-elect a master node even if one of the nodes fails.
|
|
|
-
|
|
|
-Voting-only master-eligible nodes may also fill other roles in your cluster.
|
|
|
-For instance, a node may be both a data node and a voting-only master-eligible
|
|
|
-node. A _dedicated_ voting-only master-eligible nodes is a voting-only
|
|
|
-master-eligible node that fills no other roles in the cluster. To create a
|
|
|
-dedicated voting-only master-eligible node, set:
|
|
|
-
|
|
|
-[source,yaml]
|
|
|
--------------------
|
|
|
-node.roles: [ master, voting_only ]
|
|
|
--------------------
|
|
|
-
|
|
|
-Since dedicated voting-only nodes never act as the cluster's elected master,
|
|
|
-they may require less heap and a less powerful CPU than the true master nodes.
|
|
|
-However all master-eligible nodes, including voting-only nodes, are on the
|
|
|
-critical path for <<cluster-state-publishing,publishing cluster state
|
|
|
-updates>>. Cluster state updates are usually independent of
|
|
|
-performance-critical workloads such as indexing or searches, but they are
|
|
|
-involved in management activities such as index creation and rollover, mapping
|
|
|
-updates, and recovery after a failure. The performance characteristics of these
|
|
|
-activities are a function of the speed of the storage on each master-eligible
|
|
|
-node, as well as the reliability and latency of the network interconnections
|
|
|
-between the elected master node and the other nodes in the cluster. You must
|
|
|
-therefore ensure that the storage and networking available to the nodes in your
|
|
|
-cluster are good enough to meet your performance goals.
|
|
|
-
|
|
|
-[[data-node]]
|
|
|
-==== Data nodes
|
|
|
-
|
|
|
-Data nodes hold the shards that contain the documents you have indexed. Data
|
|
|
-nodes handle data related operations like CRUD, search, and aggregations.
|
|
|
-These operations are I/O-, memory-, and CPU-intensive. It is important to
|
|
|
-monitor these resources and to add more data nodes if they are overloaded.
|
|
|
-
|
|
|
-The main benefit of having dedicated data nodes is the separation of the master
|
|
|
-and data roles.
|
|
|
-
|
|
|
-In a multi-tier deployment architecture, you use specialized data roles to
|
|
|
-assign data nodes to specific tiers: `data_content`,`data_hot`, `data_warm`,
|
|
|
-`data_cold`, or `data_frozen`. A node can belong to multiple tiers.
|
|
|
-
|
|
|
-If you want to include a node in all tiers, or if your cluster does not use multiple tiers, then you can use the generic `data` role.
|
|
|
-
|
|
|
-include::../how-to/shard-limits.asciidoc[]
|
|
|
-
|
|
|
-WARNING: If you assign a node to a specific tier using a specialized data role, then you shouldn't also assign it the generic `data` role. The generic `data` role takes precedence over specialized data roles.
|
|
|
-
|
|
|
-[[generic-data-node]]
|
|
|
-===== Generic data node
|
|
|
-
|
|
|
-Generic data nodes are included in all content tiers.
|
|
|
-
|
|
|
-To create a dedicated generic data node, set:
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ data ]
|
|
|
-----
|
|
|
-
|
|
|
-[[data-content-node]]
|
|
|
-===== Content data node
|
|
|
-
|
|
|
-Content data nodes are part of the content tier.
|
|
|
-include::{es-ref-dir}/datatiers.asciidoc[tag=content-tier]
|
|
|
-
|
|
|
-To create a dedicated content node, set:
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ data_content ]
|
|
|
-----
|
|
|
-
|
|
|
-[[data-hot-node]]
|
|
|
-===== Hot data node
|
|
|
-
|
|
|
-Hot data nodes are part of the hot tier.
|
|
|
-include::{es-ref-dir}/datatiers.asciidoc[tag=hot-tier]
|
|
|
-
|
|
|
-To create a dedicated hot node, set:
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ data_hot ]
|
|
|
-----
|
|
|
-
|
|
|
-[[data-warm-node]]
|
|
|
-===== Warm data node
|
|
|
-
|
|
|
-Warm data nodes are part of the warm tier.
|
|
|
-include::{es-ref-dir}/datatiers.asciidoc[tag=warm-tier]
|
|
|
-
|
|
|
-To create a dedicated warm node, set:
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ data_warm ]
|
|
|
-----
|
|
|
-
|
|
|
-[[data-cold-node]]
|
|
|
-===== Cold data node
|
|
|
-
|
|
|
-Cold data nodes are part of the cold tier.
|
|
|
-include::{es-ref-dir}/datatiers.asciidoc[tag=cold-tier]
|
|
|
-
|
|
|
-To create a dedicated cold node, set:
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ data_cold ]
|
|
|
-----
|
|
|
-
|
|
|
-[[data-frozen-node]]
|
|
|
-===== Frozen data node
|
|
|
-
|
|
|
-Frozen data nodes are part of the frozen tier.
|
|
|
-include::{es-ref-dir}/datatiers.asciidoc[tag=frozen-tier]
|
|
|
-
|
|
|
-To create a dedicated frozen node, set:
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ data_frozen ]
|
|
|
-----
|
|
|
-
|
|
|
-[[node-ingest-node]]
|
|
|
-==== Ingest node
|
|
|
-
|
|
|
-Ingest nodes can execute pre-processing pipelines, composed of one or more
|
|
|
-ingest processors. Depending on the type of operations performed by the ingest
|
|
|
-processors and the required resources, it may make sense to have dedicated
|
|
|
-ingest nodes, that will only perform this specific task.
|
|
|
-
|
|
|
-To create a dedicated ingest node, set:
|
|
|
-
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ ingest ]
|
|
|
-----
|
|
|
-
|
|
|
-[[coordinating-only-node]]
|
|
|
-==== Coordinating only node
|
|
|
-
|
|
|
-If you take away the ability to be able to handle master duties, to hold data,
|
|
|
-and pre-process documents, then you are left with a _coordinating_ node that
|
|
|
-can only route requests, handle the search reduce phase, and distribute bulk
|
|
|
-indexing. Essentially, coordinating only nodes behave as smart load balancers.
|
|
|
-
|
|
|
-Coordinating only nodes can benefit large clusters by offloading the
|
|
|
-coordinating node role from data and master-eligible nodes. They join the
|
|
|
-cluster and receive the full <<cluster-state,cluster state>>, like every other
|
|
|
-node, and they use the cluster state to route requests directly to the
|
|
|
-appropriate place(s).
|
|
|
-
|
|
|
-WARNING: Adding too many coordinating only nodes to a cluster can increase the
|
|
|
-burden on the entire cluster because the elected master node must await
|
|
|
-acknowledgement of cluster state updates from every node! The benefit of
|
|
|
-coordinating only nodes should not be overstated -- data nodes can happily
|
|
|
-serve the same purpose.
|
|
|
-
|
|
|
-To create a dedicated coordinating node, set:
|
|
|
-
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ ]
|
|
|
-----
|
|
|
-
|
|
|
-[[remote-node]]
|
|
|
-==== Remote-eligible node
|
|
|
-
|
|
|
-A remote-eligible node acts as a cross-cluster client and connects to
|
|
|
-<<remote-clusters,remote clusters>>. Once connected, you can search
|
|
|
-remote clusters using <<modules-cross-cluster-search,{ccs}>>. You can also sync
|
|
|
-data between clusters using <<xpack-ccr,{ccr}>>.
|
|
|
-
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ remote_cluster_client ]
|
|
|
-----
|
|
|
-
|
|
|
-[[ml-node]]
|
|
|
-==== [xpack]#Machine learning node#
|
|
|
-
|
|
|
-{ml-cap} nodes run jobs and handle {ml} API requests. For more information, see
|
|
|
-<<ml-settings>>.
|
|
|
-
|
|
|
-To create a dedicated {ml} node, set:
|
|
|
-
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ ml, remote_cluster_client]
|
|
|
-----
|
|
|
-
|
|
|
-The `remote_cluster_client` role is optional but strongly recommended.
|
|
|
-Otherwise, {ccs} fails when used in {ml} jobs or {dfeeds}. If you use {ccs} in
|
|
|
-your {anomaly-jobs}, the `remote_cluster_client` role is also required on all
|
|
|
-master-eligible nodes. Otherwise, the {dfeed} cannot start. See <<remote-node>>.
|
|
|
-
|
|
|
-[[transform-node]]
|
|
|
-==== [xpack]#{transform-cap} node#
|
|
|
-
|
|
|
-{transform-cap} nodes run {transforms} and handle {transform} API requests. For
|
|
|
-more information, see <<transform-settings>>.
|
|
|
-
|
|
|
-To create a dedicated {transform} node, set:
|
|
|
-
|
|
|
-[source,yaml]
|
|
|
-----
|
|
|
-node.roles: [ transform, remote_cluster_client ]
|
|
|
-----
|
|
|
-
|
|
|
-The `remote_cluster_client` role is optional but strongly recommended.
|
|
|
-Otherwise, {ccs} fails when used in {transforms}. See <<remote-node>>.
|
|
|
-
|
|
|
-[[change-node-role]]
|
|
|
-==== Changing the role of a node
|
|
|
-
|
|
|
-Each data node maintains the following data on disk:
|
|
|
-
|
|
|
-* the shard data for every shard allocated to that node,
|
|
|
-* the index metadata corresponding with every shard allocated to that node, and
|
|
|
-* the cluster-wide metadata, such as settings and index templates.
|
|
|
-
|
|
|
-Similarly, each master-eligible node maintains the following data on disk:
|
|
|
-
|
|
|
-* the index metadata for every index in the cluster, and
|
|
|
-* the cluster-wide metadata, such as settings and index templates.
|
|
|
-
|
|
|
-Each node checks the contents of its data path at startup. If it discovers
|
|
|
-unexpected data then it will refuse to start. This is to avoid importing
|
|
|
-unwanted <<dangling-indices,dangling indices>> which can lead
|
|
|
-to a red cluster health. To be more precise, nodes without the `data` role will
|
|
|
-refuse to start if they find any shard data on disk at startup, and nodes
|
|
|
-without both the `master` and `data` roles will refuse to start if they have any
|
|
|
-index metadata on disk at startup.
|
|
|
-
|
|
|
-It is possible to change the roles of a node by adjusting its
|
|
|
-`elasticsearch.yml` file and restarting it. This is known as _repurposing_ a
|
|
|
-node. In order to satisfy the checks for unexpected data described above, you
|
|
|
-must perform some extra steps to prepare a node for repurposing when starting
|
|
|
-the node without the `data` or `master` roles.
|
|
|
-
|
|
|
-* If you want to repurpose a data node by removing the `data` role then you
|
|
|
- should first use an <<cluster-shard-allocation-filtering,allocation filter>> to safely
|
|
|
- migrate all the shard data onto other nodes in the cluster.
|
|
|
-
|
|
|
-* If you want to repurpose a node to have neither the `data` nor `master` roles
|
|
|
- then it is simplest to start a brand-new node with an empty data path and the
|
|
|
- desired roles. You may find it safest to use an
|
|
|
- <<cluster-shard-allocation-filtering,allocation filter>> to migrate the shard data elsewhere
|
|
|
- in the cluster first.
|
|
|
-
|
|
|
-If it is not possible to follow these extra steps then you may be able to use
|
|
|
-the <<node-tool-repurpose,`elasticsearch-node repurpose`>> tool to delete any
|
|
|
-excess data that prevents a node from starting.
|
|
|
+To learn more about the available node roles, see <<node-roles-overview>>.
|
|
|
|
|
|
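+
+For instance, a dedicated master-eligible node, as suggested above for larger
+clusters, could be configured with just that one role (a minimal sketch):
+
+[source,yaml]
+----
+node.roles: [ master ]
+----
+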
[discrete]
|
|
|
=== Node data path settings
|
|
@@ -495,6 +125,25 @@ modify the contents of the data directory. The data directory contains no
|
|
|
executables so a virus scan will only find false positives.
|
|
|
// end::modules-node-data-path-warning-tag[]
|
|
|
|
|
|
+[[custom-node-attributes]]
|
|
|
+==== Custom node attributes
|
|
|
+
|
|
|
+If needed, you can add custom attributes to a node. These attributes can be used to <<cluster-routing-settings,filter which nodes a shard can be allocated to>>, or to group nodes together for <<shard-allocation-awareness,shard allocation awareness>>.
|
|
|
+
|
|
|
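+For example, a minimal sketch of defining a custom attribute in
+`elasticsearch.yml` (the attribute name `rack_id` and the value `rack_one`
+are only illustrations):
+
+[source,yaml]
+----
+node.attr.rack_id: rack_one
+----
+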
+[TIP]
|
|
|
+===============================================
|
|
|
+You can also set a node attribute using the `-E` command line argument when you start a node:
|
|
|
+
|
|
|
+[source,sh]
|
|
|
+--------------------------------------------------------
|
|
|
+./bin/elasticsearch -Enode.attr.rack_id=rack_one
|
|
|
+--------------------------------------------------------
|
|
|
+===============================================
|
|
|
+
|
|
|
+`node.attr.<attribute-name>`::
|
|
|
+ (<<static-cluster-setting,Static>>)
|
|
|
+ A custom attribute that you can assign to a node. For example, you might assign a `rack_id` attribute to each node to ensure that primary and replica shards are not allocated on the same rack. To assign multiple attributes to a node, define a separate `node.attr.<attribute-name>` setting for each attribute.
|
|
|
+
|
|
|
[discrete]
|
|
|
[[other-node-settings]]
|
|
|
=== Other node settings
|
|
@@ -504,4 +153,4 @@ including:
|
|
|
|
|
|
* <<cluster-name,`cluster.name`>>
|
|
|
* <<node-name,`node.name`>>
|
|
|
-* <<modules-network,network settings>>
|
|
|
+* <<modules-network,network settings>>
|