@@ -1,114 +1,110 @@
[[allocation-awareness]]
-=== Shard Allocation Awareness
+=== Shard allocation awareness
+
+You can use custom node attributes as _awareness attributes_ to enable {es}
+to take your physical hardware configuration into account when allocating shards.
+If {es} knows which nodes are on the same physical server, in the same rack, or
+in the same zone, it can distribute the primary shard and its replica shards to
+minimise the risk of losing all shard copies in the event of a failure.
+
+When shard allocation awareness is enabled with the
+`cluster.routing.allocation.awareness.attributes` setting, shards are only
+allocated to nodes that have values set for the specified awareness
+attributes. If you use multiple awareness attributes, {es} considers
+each attribute separately when allocating shards.
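+
+For example, a minimal `elasticsearch.yml` sketch that makes {es} consider
+both a rack and a zone attribute (assuming your nodes define both `rack_id`
+and `zone`):
+
+[source,yaml]
+--------------------------------------------------------
+# Example only: assumes nodes define both rack_id and zone attributes.
+cluster.routing.allocation.awareness.attributes: rack_id,zone
+--------------------------------------------------------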
+
+The allocation awareness settings can be configured in
+`elasticsearch.yml` and updated dynamically with the
+<<cluster-update-settings,cluster-update-settings>> API.
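+
+For example, a hedged sketch of a dynamic update (assuming a node listening on
+`localhost:9200`; `persistent` is one option, `transient` is the other):
+
+[source,sh]
+--------------------------------------------------------
+# Sketch: assumes a local node on localhost:9200; adjust host/port as needed.
+curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
+{
+  "persistent": {
+    "cluster.routing.allocation.awareness.attributes": "rack_id"
+  }
+}'
+--------------------------------------------------------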

-When running nodes on multiple VMs on the same physical server, on multiple
-racks, or across multiple zones or domains, it is more likely that two nodes on
-the same physical server, in the same rack, or in the same zone or domain will
-crash at the same time, rather than two unrelated nodes crashing
-simultaneously.
+{es} prefers using shards in the same location (with the same
+awareness attribute values) to process search or GET requests. Using local
+shards is usually faster than crossing rack or zone boundaries.

-If Elasticsearch is _aware_ of the physical configuration of your hardware, it
-can ensure that the primary shard and its replica shards are spread across
-different physical servers, racks, or zones, to minimise the risk of losing
-all shard copies at the same time.
+NOTE: The number of attribute values determines how many shard copies are
+allocated in each location. If the number of nodes in each location is
+unbalanced and there are a lot of replicas, replica shards might be left
+unassigned.

-The shard allocation awareness settings allow you to tell Elasticsearch about
-your hardware configuration.
+[float]
+[[enabling-awareness]]
+==== Enabling shard allocation awareness

-As an example, let's assume we have several racks. When we start a node, we
-can tell it which rack it is in by assigning it an arbitrary metadata
-attribute called `rack_id` -- we could use any attribute name. For example:
+To enable shard allocation awareness:

+. Specify the location of each node with a custom node attribute. For example,
+if you want {es} to distribute shards across different racks, you might
+set an awareness attribute called `rack_id` in each node's `elasticsearch.yml`
+config file.
++
+[source,yaml]
+--------------------------------------------------------
+node.attr.rack_id: rack_one
+--------------------------------------------------------
++
+You can also set custom attributes when you start a node:
++
[source,sh]
-----------------------
-./bin/elasticsearch -Enode.attr.rack_id=rack_one <1>
-----------------------
-<1> This setting could also be specified in the `elasticsearch.yml` config file.
-
-Now, we need to set up _shard allocation awareness_ by telling Elasticsearch
-which attributes to use. This can be configured in the `elasticsearch.yml`
-file on *all* master-eligible nodes, or it can be set (and changed) with the
-<<cluster-update-settings,cluster-update-settings>> API.
-
-For our example, we'll set the value in the config file:
+--------------------------------------------------------
+./bin/elasticsearch -Enode.attr.rack_id=rack_one
+--------------------------------------------------------

+. Tell {es} to take one or more awareness attributes into account when
+allocating shards by setting
+`cluster.routing.allocation.awareness.attributes` in *every* master-eligible
+node's `elasticsearch.yml` config file.
++
+--
[source,yaml]
--------------------------------------------------------
-cluster.routing.allocation.awareness.attributes: rack_id
+cluster.routing.allocation.awareness.attributes: rack_id <1>
--------------------------------------------------------
-
-With this config in place, let's say we start two nodes with
-`node.attr.rack_id` set to `rack_one`, and we create an index with 5 primary
-shards and 1 replica of each primary. All primaries and replicas are
+<1> Specify multiple attributes as a comma-separated list.
+--
++
+You can also use the
+<<cluster-update-settings,cluster-update-settings>> API to set or update
+a cluster's awareness attributes.
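+
+Once the nodes are running, you can check that each node reports the expected
+custom attributes. A quick sketch using the cat nodeattrs API (assuming a node
+listening on `localhost:9200`):
+
+[source,sh]
+--------------------------------------------------------
+# Sketch: assumes a local node on localhost:9200.
+curl -X GET "localhost:9200/_cat/nodeattrs?v"
+--------------------------------------------------------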
+
+With this example configuration, if you start two nodes with
+`node.attr.rack_id` set to `rack_one` and create an index with 5 primary
+shards and 1 replica of each primary, all primaries and replicas are
allocated across the two nodes.

-Now, if we start two more nodes with `node.attr.rack_id` set to `rack_two`,
-Elasticsearch will move shards across to the new nodes, ensuring (if possible)
-that no two copies of the same shard will be in the same rack. However if
-`rack_two` were to fail, taking down both of its nodes, Elasticsearch will
-still allocate the lost shard copies to nodes in `rack_one`.
-
-.Prefer local shards
-*********************************************
-
-When executing search or GET requests, with shard awareness enabled,
-Elasticsearch will prefer using local shards -- shards in the same awareness
-group -- to execute the request. This is usually faster than crossing between
-racks or across zone boundaries.
+If you add two nodes with `node.attr.rack_id` set to `rack_two`,
+{es} moves shards to the new nodes, ensuring (if possible)
+that no two copies of the same shard are in the same rack.
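+
+To see where the copies of each shard actually landed, one option is the cat
+shards API. A sketch, assuming a node listening on `localhost:9200`:
+
+[source,sh]
+--------------------------------------------------------
+# Sketch: assumes a local node on localhost:9200.
+curl -X GET "localhost:9200/_cat/shards?v"
+--------------------------------------------------------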

-*********************************************
-
-Multiple awareness attributes can be specified, in which case each attribute
-is considered separately when deciding where to allocate the shards.
-
-[source,yaml]
-------------------------------------------------------------
-cluster.routing.allocation.awareness.attributes: rack_id,zone
-------------------------------------------------------------
-
-NOTE: When using awareness attributes, shards will not be allocated to nodes
-that don't have values set for those attributes.
-
-NOTE: Number of primary/replica of a shard allocated on a specific group of
-nodes with the same awareness attribute value is determined by the number of
-attribute values. When the number of nodes in groups is unbalanced and there
-are many replicas, replica shards may be left unassigned.
+If `rack_two` fails and takes down both its nodes, by default {es}
+allocates the lost shard copies to nodes in `rack_one`. To prevent multiple
+copies of a particular shard from being allocated in the same location, you can
+enable forced awareness.

[float]
[[forced-awareness]]
-=== Forced Awareness
-
-Imagine that you have two zones and enough hardware across the two zones to
-host all of your primary and replica shards. But perhaps the hardware in a
-single zone, while sufficient to host half the shards, would be unable to host
-*ALL* the shards.
+==== Forced awareness

-With ordinary awareness, if one zone lost contact with the other zone,
-Elasticsearch would assign all of the missing replica shards to a single zone.
-But in this example, this sudden extra load would cause the hardware in the
-remaining zone to be overloaded.
+By default, if one location fails, {es} assigns all of the missing
+replica shards to the remaining locations. While you might have sufficient
+resources across all locations to host your primary and replica shards, a single
+location might be unable to host *ALL* of the shards.

-Forced awareness solves this problem by *NEVER* allowing copies of the same
-shard to be allocated to the same zone.
+To prevent a single location from being overloaded in the event of a failure,
+you can set `cluster.routing.allocation.awareness.force` so no replicas are
+allocated until nodes are available in another location.

-For example, lets say we have an awareness attribute called `zone`, and we
-know we are going to have two zones, `zone1` and `zone2`. Here is how we can
-force awareness on a node:
+For example, if you have an awareness attribute called `zone` and configure nodes
+in `zone1` and `zone2`, you can use forced awareness to prevent {es}
+from allocating replicas if only one zone is available:

[source,yaml]
-------------------------------------------------------------------
-cluster.routing.allocation.awareness.force.zone.values: zone1,zone2 <1>
cluster.routing.allocation.awareness.attributes: zone
+cluster.routing.allocation.awareness.force.zone.values: zone1,zone2 <1>
-------------------------------------------------------------------
-<1> We must list all possible values that the `zone` attribute can have.
-
-Now, if we start 2 nodes with `node.attr.zone` set to `zone1` and create an
-index with 5 shards and 1 replica. The index will be created, but only the 5
-primary shards will be allocated (with no replicas). Only when we start more
-nodes with `node.attr.zone` set to `zone2` will the replicas be allocated.
-
-The `cluster.routing.allocation.awareness.*` settings can all be updated
-dynamically on a live cluster with the
-<<cluster-update-settings,cluster-update-settings>> API.
-
+<1> Specify all possible values for the awareness attribute.

+With this example configuration, if you start two nodes with `node.attr.zone` set
+to `zone1` and create an index with 5 shards and 1 replica, {es} creates
+the index and allocates the 5 primary shards but no replicas. Replicas are
+only allocated once nodes with `node.attr.zone` set to `zone2` are available.
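+
+If a replica stays unassigned, the cluster allocation explain API reports the
+reason. A minimal sketch (with no request body, it explains an arbitrary
+unassigned shard), assuming a node listening on `localhost:9200`:
+
+[source,sh]
+--------------------------------------------------------
+# Sketch: assumes a local node on localhost:9200.
+curl -X GET "localhost:9200/_cluster/allocation/explain"
+--------------------------------------------------------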