[DOCS] Reworked the shard allocation filtering info. (#36456)

* [DOCS] Reworked the shard allocation filtering info. Closes #36079

* Added multiple index allocation settings example back.

* Removed extraneous space
debadair, 6 years ago
commit c9e03e6ead

+ 59 - 48
docs/reference/index-modules/allocation/filtering.asciidoc

@@ -1,29 +1,54 @@
 [[shard-allocation-filtering]]
-=== Shard Allocation Filtering
-
-Shard allocation filtering allows you to specify which nodes are allowed
-to host the shards of a particular index.
-
-NOTE: The per-index shard allocation filters explained below work in
-conjunction with the cluster-wide allocation filters explained in
-<<shards-allocation>>.
-
-It is possible to assign arbitrary metadata attributes to each node at
-startup.  For instance, nodes could be assigned a `rack` and a `size`
-attribute as follows:
-
+=== Index-level shard allocation filtering
+
+You can use shard allocation filters to control where {es} allocates shards of
+a particular index. These per-index filters are applied in conjunction with
+<<allocation-filtering, cluster-wide allocation filtering>> and
+<<allocation-awareness, allocation awareness>>.
+
+Shard allocation filters can be based on custom node attributes or the built-in
+`_name`, `_host_ip`, `_publish_ip`, `_ip`, and `_host` attributes.
+<<index-lifecycle-management, Index lifecycle management>> uses filters based
+on custom node attributes to determine how to reallocate shards when moving
+between phases.
+
+The `index.routing.allocation` settings are dynamic, enabling live indices to
+be moved from one set of nodes to another. Shards are only relocated if it is
+possible to do so without breaking another routing constraint, such as never
+allocating a primary and replica shard on the same node.
+
+For example, you could use a custom node attribute to indicate a node's
+performance characteristics and use shard allocation filtering to route shards
+for a particular index to the most appropriate class of hardware.
+
+[float]
+[[index-allocation-filters]]
+==== Enabling index-level shard allocation filtering
+
+To filter based on a custom node attribute:
+
+. Specify the filter characteristics with a custom node attribute in each
+node's `elasticsearch.yml` configuration file. For example, if you have `small`,
+`medium`, and `big` nodes, you could add a `size` attribute to filter based
+on node size.
++
+[source,yaml]
+--------------------------------------------------------
+node.attr.size: medium
+--------------------------------------------------------
++
+You can also set custom attributes when you start a node:
++
 [source,sh]
-------------------------
-bin/elasticsearch -Enode.attr.rack=rack1 -Enode.attr.size=big  <1>
-------------------------
-<1> These attribute settings can also be specified in the `elasticsearch.yml` config file.
-
-These metadata attributes can be used with the
-`index.routing.allocation.*` settings to allocate an index to a particular
-group of nodes.  For instance, we can move the index `test` to either `big` or
-`medium` nodes as follows:
-
-
+--------------------------------------------------------
+./bin/elasticsearch -Enode.attr.size=medium
+--------------------------------------------------------
+
+. Add a routing allocation filter to the index. The `index.routing.allocation`
+settings support three types of filters: `include`, `exclude`, and `require`.
+For example, to tell {es} to allocate shards from the `test` index to either
+`big` or `medium` nodes, use `index.routing.allocation.include`:
++
 [source,js]
 ------------------------
 PUT test/_settings
@@ -33,24 +58,11 @@ PUT test/_settings
 ------------------------
 // CONSOLE
 // TEST[s/^/PUT test\n/]
-
-Alternatively, we can move the index `test` away from the `small` nodes with
-an `exclude` rule:
-
-[source,js]
-------------------------
-PUT test/_settings
-{
-  "index.routing.allocation.exclude.size": "small"
-}
-------------------------
-// CONSOLE
-// TEST[s/^/PUT test\n/]
-
-Multiple rules can be specified, in which case all conditions must be
-satisfied.  For instance, we could move the index `test` to `big` nodes in
-`rack1` with the following:
-
++
+If you specify multiple filters, all conditions must be satisfied for shards to
+be relocated. For example, to move the `test` index to `big` nodes in `rack1`,
+you could specify:
++
 [source,js]
 ------------------------
 PUT test/_settings
@@ -62,10 +74,9 @@ PUT test/_settings
 // CONSOLE
 // TEST[s/^/PUT test\n/]
 
-NOTE: If some conditions cannot be satisfied then shards will not be moved.
-
-The following settings are _dynamic_, allowing live indices to be moved from
-one set of nodes to another:
+[float]
+[[index-allocation-settings]]
+==== Index allocation filter settings
 
 `index.routing.allocation.include.{attribute}`::
 
@@ -82,7 +93,7 @@ one set of nodes to another:
     Assign the index to a node whose `{attribute}` has _none_ of the
     comma-separated values.
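+
+For example, a minimal sketch that uses `require` to route shards of the
+`test` index only to nodes whose `size` attribute (reusing the attribute from
+the examples above) is set to `big`:
+
+[source,js]
+------------------------
+PUT test/_settings
+{
+  "index.routing.allocation.require.size": "big"
+}
+------------------------
+// CONSOLE
+// TEST[s/^/PUT test\n/]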
 
-These special attributes are also supported:
+The index allocation settings support the following built-in attributes:
 
 [horizontal]
 `_name`::       Match nodes by node name
@@ -91,7 +102,7 @@ These special attributes are also supported:
 `_ip`::         Match either `_host_ip` or `_publish_ip`
 `_host`::       Match nodes by hostname
 
-All attribute values can be specified with wildcards, eg:
+You can use wildcards when specifying attribute values, for example:
 
 [source,js]
 ------------------------

+ 1 - 4
docs/reference/index-modules/allocation/total_shards.asciidoc

@@ -1,5 +1,5 @@
 [[allocation-total-shards]]
-=== Total Shards Per Node
+=== Total shards per node
 
 The cluster-level shard allocator tries to spread the shards of a single index
 across as many nodes as possible.  However, depending on how many shards and
@@ -28,6 +28,3 @@ allocated.
 
 Use with caution.
 =======================================
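+
+For example, a minimal sketch that caps a hypothetical `test` index at two
+shards per node (an illustrative value; base the limit on your shard and node
+counts):
+
+[source,js]
+--------------------------------------------------
+PUT test/_settings
+{
+  "index.routing.allocation.total_shards_per_node": 2
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[s/^/PUT test\n/]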
-
-
-

+ 84 - 88
docs/reference/modules/cluster/allocation_awareness.asciidoc

@@ -1,114 +1,110 @@
 [[allocation-awareness]]
-=== Shard Allocation Awareness
+=== Shard allocation awareness
+
+You can use custom node attributes as _awareness attributes_ to enable {es}
+to take your physical hardware configuration into account when allocating shards.
+If {es} knows which nodes are on the same physical server, in the same rack, or
+in the same zone, it can distribute the primary shard and its replica shards to
+minimise the risk of losing all shard copies in the event of a failure.
+
+When shard allocation awareness is enabled with the
+`cluster.routing.allocation.awareness.attributes` setting, shards are only
+allocated to nodes that have values set for the specified awareness
+attributes. If you use multiple awareness attributes, {es} considers
+each attribute separately when allocating shards.
+
+The allocation awareness settings can be configured in
+`elasticsearch.yml` and updated dynamically with the
+<<cluster-update-settings,cluster-update-settings>> API.
 
-When running nodes on multiple VMs on the same physical server, on multiple
-racks, or across multiple zones or domains, it is more likely that two nodes on
-the same physical server, in the same rack, or in the same zone or domain will
-crash at the same time, rather than two unrelated nodes crashing
-simultaneously.
+{es} prefers using shards in the same location (with the same
+awareness attribute values) to process search or GET requests. Using local
+shards is usually faster than crossing rack or zone boundaries.
 
-If Elasticsearch is _aware_ of the physical configuration of your hardware, it
-can ensure that the primary shard and its replica shards are spread across
-different physical servers, racks, or zones, to minimise the risk of losing
-all shard copies at the same time.
+NOTE: The number of attribute values determines how many shard copies are
+allocated in each location. If the number of nodes in each location is
+unbalanced and there are a lot of replicas, replica shards might be left
+unassigned.
 
-The shard allocation awareness settings allow you to tell Elasticsearch about
-your hardware configuration.
+[float]
+[[enabling-awareness]]
+==== Enabling shard allocation awareness
 
-As an example, let's assume we have several racks.  When we start a node, we
-can tell it which rack it is in by assigning it an arbitrary metadata
-attribute called `rack_id` -- we could use any attribute name.  For example:
+To enable shard allocation awareness:
 
+. Specify the location of each node with a custom node attribute. For example,
+if you want {es} to distribute shards across different racks, you might
+set an awareness attribute called `rack_id` in each node's `elasticsearch.yml`
+config file.
++
+[source,yaml]
+--------------------------------------------------------
+node.attr.rack_id: rack_one
+--------------------------------------------------------
++
+You can also set custom attributes when you start a node:
++
 [source,sh]
-----------------------
-./bin/elasticsearch -Enode.attr.rack_id=rack_one <1>
-----------------------
-<1> This setting could also be specified in the `elasticsearch.yml` config file.
-
-Now, we need to set up _shard allocation awareness_  by telling Elasticsearch
-which attributes to use.  This can be configured in the `elasticsearch.yml`
-file on *all* master-eligible nodes, or it can be set (and changed) with the
-<<cluster-update-settings,cluster-update-settings>> API.
-
-For our example, we'll set the value in the config file:
+--------------------------------------------------------
+./bin/elasticsearch -Enode.attr.rack_id=rack_one
+--------------------------------------------------------
 
+. Tell {es} to take one or more awareness attributes into account when
+allocating shards by setting
+`cluster.routing.allocation.awareness.attributes` in *every* master-eligible
+node's `elasticsearch.yml` config file.
++
+--
 [source,yaml]
 --------------------------------------------------------
-cluster.routing.allocation.awareness.attributes: rack_id
+cluster.routing.allocation.awareness.attributes: rack_id <1>
 --------------------------------------------------------
-
-With this config in place, let's say we start two nodes with
-`node.attr.rack_id` set to `rack_one`, and we create an index with 5 primary
-shards and 1 replica of each primary.  All primaries and replicas are
+<1> Specify multiple attributes as a comma-separated list.
+--
++
+You can also use the
+<<cluster-update-settings,cluster-update-settings>> API to set or update
+a cluster's awareness attributes.
+
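+For example, a minimal sketch of setting the awareness attributes dynamically
+(shown as a transient setting; use `persistent` if the setting should survive
+a full cluster restart):
+
+[source,js]
+--------------------------------------------------
+PUT _cluster/settings
+{
+  "transient" : {
+    "cluster.routing.allocation.awareness.attributes" : "rack_id"
+  }
+}
+--------------------------------------------------
+// CONSOLE
+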
+With this example configuration, if you start two nodes with
+`node.attr.rack_id` set to `rack_one` and create an index with 5 primary
+shards and 1 replica of each primary, all primaries and replicas are
 allocated across the two nodes.
 
-Now, if we start two more nodes with `node.attr.rack_id` set to `rack_two`,
-Elasticsearch will move shards across to the new nodes, ensuring (if possible)
-that no two copies of the same shard will be in the same rack. However if
-`rack_two` were to fail, taking down both of its nodes, Elasticsearch will
-still allocate the lost shard copies to nodes in `rack_one`. 
-
-.Prefer local shards
-*********************************************
-
-When executing search or GET requests, with shard awareness enabled,
-Elasticsearch will prefer using local shards -- shards in the same awareness
-group -- to execute the request. This is usually faster than crossing between
-racks or across zone boundaries.
+If you add two nodes with `node.attr.rack_id` set to `rack_two`,
+{es} moves shards to the new nodes, ensuring (if possible)
+that no two copies of the same shard are in the same rack.
 
-*********************************************
-
-Multiple awareness attributes can be specified, in which case each attribute
-is considered separately when deciding where to allocate the shards.
-
-[source,yaml]
--------------------------------------------------------------
-cluster.routing.allocation.awareness.attributes: rack_id,zone
--------------------------------------------------------------
-
-NOTE: When using awareness attributes, shards will not be allocated to nodes
-that don't have values set for those attributes.
-
-NOTE: Number of primary/replica of a shard allocated on a specific group of
-nodes with the same awareness attribute value is determined by the number of
-attribute values. When the number of nodes in groups is unbalanced and there
-are many replicas, replica shards may be left unassigned.
+If `rack_two` fails and takes down both its nodes, by default {es}
+allocates the lost shard copies to nodes in `rack_one`. To prevent multiple
+copies of a particular shard from being allocated in the same location, you can
+enable forced awareness.
 
 [float]
 [[forced-awareness]]
-=== Forced Awareness
-
-Imagine that you have two zones and enough hardware across the two zones to
-host all of your primary and replica shards.  But perhaps the hardware in a
-single zone, while sufficient to host half the shards, would be unable to host
-*ALL* the shards.
+==== Forced awareness
 
-With ordinary awareness, if one zone lost contact with the other zone,
-Elasticsearch would assign all of the missing replica shards to a single zone.
-But in this example, this sudden extra load would cause the hardware in the
-remaining zone to be overloaded.
+By default, if one location fails, {es} assigns all of the missing
+replica shards to the remaining locations. While you might have sufficient
+resources across all locations to host your primary and replica shards, a single
+location might be unable to host *ALL* of the shards.
 
-Forced awareness solves this problem by *NEVER* allowing copies of the same
-shard to be allocated to the same zone.
+To prevent a single location from being overloaded in the event of a failure,
+you can set `cluster.routing.allocation.awareness.force` so no replicas are
+allocated until nodes are available in another location.
 
-For example, lets say we have an awareness attribute called `zone`, and we
-know we are going to have two zones, `zone1` and `zone2`. Here is how we can
-force awareness on a node:
+For example, if you have an awareness attribute called `zone` and configure nodes
+in `zone1` and `zone2`, you can use forced awareness to prevent {es}
+from allocating replicas if only one zone is available:
 
 [source,yaml]
 -------------------------------------------------------------------
-cluster.routing.allocation.awareness.force.zone.values: zone1,zone2 <1>
 cluster.routing.allocation.awareness.attributes: zone
+cluster.routing.allocation.awareness.force.zone.values: zone1,zone2 <1>
 -------------------------------------------------------------------
-<1> We must list all possible values that the `zone` attribute can have.
-
-Now, if we start 2 nodes with `node.attr.zone` set to `zone1` and create an
-index with 5 shards and 1 replica. The index will be created, but only the 5
-primary shards will be allocated (with no replicas). Only when we start more
-nodes with `node.attr.zone` set to `zone2` will the replicas be allocated.
-
-The `cluster.routing.allocation.awareness.*` settings can all be updated
-dynamically on a live cluster with the
-<<cluster-update-settings,cluster-update-settings>> API.
-
+<1> Specify all possible values for the awareness attribute.
 
+With this example configuration, if you start two nodes with `node.attr.zone` set
+to `zone1` and create an index with 5 shards and 1 replica, {es} creates
+the index and allocates the 5 primary shards but no replicas. Replicas are
+only allocated once nodes with `node.attr.zone` set to `zone2` are available.

+ 33 - 31
docs/reference/modules/cluster/allocation_filtering.asciidoc

@@ -1,13 +1,37 @@
 [[allocation-filtering]]
-=== Shard Allocation Filtering
+=== Cluster-level shard allocation filtering
 
-While <<index-modules-allocation>> provides *per-index* settings to control the
-allocation of shards to nodes, cluster-level shard allocation filtering allows
-you to allow or disallow the allocation of shards from *any* index to
-particular nodes.
+You can use cluster-level shard allocation filters to control where {es}
+allocates shards from any index. These cluster-wide filters are applied in
+conjunction with <<shard-allocation-filtering, per-index allocation filtering>>
+and <<allocation-awareness, allocation awareness>>.
 
-The available _dynamic_ cluster settings are as follows, where `{attribute}`
-refers to an arbitrary node attribute.:
+Shard allocation filters can be based on custom node attributes or the built-in
+`_name`, `_ip`, and `_host` attributes.
+
+The `cluster.routing.allocation` settings are dynamic, enabling live indices to
+be moved from one set of nodes to another. Shards are only relocated if it is
+possible to do so without breaking another routing constraint, such as never
+allocating a primary and replica shard on the same node.
+
+The most common use case for cluster-level shard allocation filtering is when
+you want to decommission a node. To move shards off a node prior to shutting
+it down, you could create a filter that excludes the node by its IP address:
+
+[source,js]
+--------------------------------------------------
+PUT _cluster/settings
+{
+  "transient" : {
+    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
+  }
+}
+--------------------------------------------------
+// CONSOLE
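+
+Once the shards have been moved and the node is shut down, a sketch of
+removing the exclusion (setting a transient setting to `null` clears it):
+
+[source,js]
+--------------------------------------------------
+PUT _cluster/settings
+{
+  "transient" : {
+    "cluster.routing.allocation.exclude._ip" : null
+  }
+}
+--------------------------------------------------
+// CONSOLE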
+
+[float]
+[[cluster-routing-settings]]
+==== Cluster routing settings
 
 `cluster.routing.allocation.include.{attribute}`::
 
@@ -24,36 +48,14 @@ refers to an arbitrary node attribute.:
     Do not allocate shards to a node whose `{attribute}` has _any_ of the
     comma-separated values.
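+
+For example, a minimal sketch that uses a hypothetical custom `zone` attribute
+to allocate shards from all indices only to nodes in `zone1`:
+
+[source,js]
+--------------------------------------------------
+PUT _cluster/settings
+{
+  "transient" : {
+    "cluster.routing.allocation.include.zone" : "zone1"
+  }
+}
+--------------------------------------------------
+// CONSOLE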
 
-These special attributes are also supported:
+The cluster allocation settings support the following built-in attributes:
 
 [horizontal]
 `_name`::   Match nodes by node names
 `_ip`::     Match nodes by IP addresses (the IP address associated with the hostname)
 `_host`::   Match nodes by hostnames
 
-The typical use case for cluster-wide shard allocation filtering is when you
-want to decommission a node, and you would like to move the shards from that
-node to other nodes in the cluster before shutting it down.
-
-For instance, we could decommission a node using its IP address as follows:
-
-[source,js]
---------------------------------------------------
-PUT _cluster/settings
-{
-  "transient" : {
-    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
-  }
-}
---------------------------------------------------
-// CONSOLE
-
-NOTE: Shards will only be relocated if it is possible to do so without
-breaking another routing constraint, such as never allocating a primary and
-replica shard to the same node.
-
-In addition to listing multiple values as a comma-separated list, all
-attribute values can be specified with wildcards, eg:
+You can use wildcards when specifying attribute values, for example:
 
 [source,js]
 ------------------------

+ 1 - 1
docs/reference/modules/cluster/disk_allocator.asciidoc

@@ -1,5 +1,5 @@
 [[disk-allocator]]
-=== Disk-based Shard Allocation
+=== Disk-based shard allocation
 
 Elasticsearch considers the available disk space on a node before deciding
 whether to allocate new shards to that node or to actively relocate shards away

+ 4 - 4
docs/reference/modules/cluster/shards_allocation.asciidoc

@@ -1,12 +1,12 @@
 [[shards-allocation]]
-=== Cluster Level Shard Allocation
+=== Cluster-level shard allocation
 
 Shard allocation is the process of allocating shards to nodes. This can
 happen during initial recovery, replica allocation, rebalancing, or
 when nodes are added or removed.
 
 [float]
-=== Shard Allocation Settings
+=== Shard allocation settings
 
 The following _dynamic_ settings may be used to control shard allocation and recovery:
 
@@ -59,7 +59,7 @@ one of the active allocation ids in the cluster state.
       setting only applies if multiple nodes are started on the same machine.
 
 [float]
-=== Shard Rebalancing Settings
+=== Shard rebalancing settings
 
 The following _dynamic_ settings may be used to control the rebalancing of
 shards across the cluster:
@@ -98,7 +98,7 @@ Specify when shard rebalancing is allowed:
       or <<forced-awareness,forced awareness>>.
 
 [float]
-=== Shard Balancing Heuristics
+=== Shard balancing heuristics
 
 The following settings are used together to determine where to place each
 shard.  The cluster is balanced when no allowed rebalancing operation can bring the weight