
How to increase node capacity docs (#87188)

This adds troubleshooting documentation for the case when the ShardsAvailabilityHealthIndicatorService
reports that there are not enough nodes in the data tier (user action "increase_node_capacity_for_allocations" or
"increase_tier_capacity_for_allocations"). This covers both cloud and self-managed environments. For
cloud we first recommend increasing the number of availability zones (because you cannot directly add nodes), and
decreasing index.number_of_replicas if that is not possible. For self-managed, we first recommend adding nodes,
and decreasing index.number_of_replicas if that is not possible.
Keith Massey 3 years ago
parent
commit
6caf39c109

+ 40 - 0
docs/reference/tab-widgets/troubleshooting/data/increase-tier-capacity-widget.asciidoc

@@ -0,0 +1,40 @@
+++++
+<div class="tabs" data-tab-group="host">
+  <div role="tablist" aria-label="Cluster shards limit">
+    <button role="tab"
+            aria-selected="true"
+            aria-controls="cloud-tab-node-capacity"
+            id="cloud-node-capacity">
+      Elasticsearch Service
+    </button>
+    <button role="tab"
+            aria-selected="false"
+            aria-controls="self-managed-tab-node-capacity"
+            id="self-managed-node-capacity"
+            tabindex="-1">
+      Self-managed
+    </button>
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="cloud-tab-node-capacity"
+       aria-labelledby="cloud-node-capacity">
+++++
+
+include::increase-tier-capacity.asciidoc[tag=cloud]
+
+++++
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="self-managed-tab-node-capacity"
+       aria-labelledby="self-managed-node-capacity"
+       hidden="">
+++++
+
+include::increase-tier-capacity.asciidoc[tag=self-managed]
+
+++++
+  </div>
+</div>
+++++

+ 320 - 0
docs/reference/tab-widgets/troubleshooting/data/increase-tier-capacity.asciidoc

@@ -0,0 +1,320 @@
+//////////////////////////
+
+[source,console]
+--------------------------------------------------
+PUT my-index-000001
+
+--------------------------------------------------
+// TESTSETUP
+
+[source,console]
+--------------------------------------------------
+PUT /my-index-000001/_settings
+{
+  "index" : {
+    "number_of_replicas" : 2
+  }
+}
+
+DELETE my-index-000001
+--------------------------------------------------
+// TEARDOWN
+
+//////////////////////////
+
+// tag::cloud[]
+One way to get the replica shards assigned is to add an availability zone. This
+increases the number of data nodes in the {es} cluster so that the replica shards
+can be assigned, and is done by editing your deployment. But first, you need to
+discover which tier the index is targeting for assignment. You can do this using
+{kib}.
+
+**Use {kib}**
+
+//tag::kibana-api-ex-1[]
+. Log in to the {ess-console}[{ecloud} console].
++
+
+. On the **Elasticsearch Service** panel, click the name of your deployment.
++
+
+NOTE: If the name of your deployment is disabled, your {kib} instances might be
+unhealthy, in which case please contact https://support.elastic.co[Elastic Support].
+If your deployment doesn't include {kib}, all you need to do is
+{cloud}/ec-access-kibana.html[enable it first].
+
+. Open your deployment's side navigation menu (located under the Elastic logo in
+the upper-left corner) and go to **Dev Tools > Console**.
++
+[role="screenshot"]
+image::images/kibana-console.png[{kib} Console,align="center"]
+
+To inspect which tier an index is targeting for assignment, use the <<indices-get-settings, get index setting>>
+API to retrieve the configured value for the `index.routing.allocation.include._tier_preference`
+setting:
+
+[source,console]
+----
+GET /my-index-000001/_settings/index.routing.allocation.include._tier_preference?flat_settings
+----
+// TEST[skip:the result is for illustration purposes only]
+
+
+The response will look like this:
+
+[source,console-result]
+----
+{
+  "my-index-000001": {
+    "settings": {
+      "index.routing.allocation.include._tier_preference": "data_warm,data_hot" <1>
+    }
+  }
+}
+----
+// TESTRESPONSE[skip:the result is for illustration purposes only]
+
+<1> Represents a comma-separated list of the data tier node roles the index is
+allowed to be allocated on; the first tier in the list has the highest priority
+and is the one the index is targeting. In this example the tier preference is
+`data_warm,data_hot`, so the index is targeting the `warm` tier and more nodes
+with the `data_warm` role are needed in the {es} cluster.
+
+//end::kibana-api-ex-1[]
+
+//tag::increase-azs[]
+[[increase-azs]]
+Now that you know the tier, you want to increase the number of nodes in that tier
+so that the replicas can be allocated. To do this, you can either increase the
+size per zone to increase the number of nodes in the availability zone(s) you are
+already using, or increase the number of availability zones. Go back to the
+deployment's landing page by clicking the three horizontal bars in the top left
+of the screen and choosing **Manage this deployment**. On that page, click the
+**Manage** button and choose **Edit deployment**. Note that you must be logged in
+to https://cloud.elastic.co/ to do this. In the {es} section, find the tier where
+the replica shards could not be assigned.
+
+[role="screenshot"]
+image::images/data-tiers/ess-advanced-config-data-tiers.png[{kib} Console,align="center"]
+
+* Option 1: Increase the size per zone
+** Look at the values in the **Size per zone** drop-down. One node is created in
+each zone for every 64 GB of RAM you select here. If you currently have 64 GB of
+RAM or less selected, you have one node in each zone. If you select 128 GB of RAM,
+you get two nodes per zone; if you select 192 GB of RAM, you get three nodes per
+zone, and so on. If the value is less than the maximum possible, you can choose a
+higher value for that tier to add more nodes.
+
+* Option 2: Increase the number of availability zones
+** Find the **Availability zones** selection. If it is less than 3, you can select a higher number of availability
+zones for that tier.
+
+//end::increase-azs[]
+
+If it is not possible to increase the size per zone or the number of availability
+zones, you can reduce the number of replicas of your index data. We'll achieve
+this by inspecting the <<dynamic-index-number-of-replicas,`index.number_of_replicas`>>
+index setting and decreasing the configured value.
+
+//tag::kibana-api-ex-2[]
+. Access {kib} as described above.
+
+. Inspect the <<dynamic-index-number-of-replicas,`index.number_of_replicas`>> index setting.
++
+[source,console]
+----
+GET /my-index-000001/_settings/index.number_of_replicas
+----
++
+The response will look like this:
++
+[source,console-result]
+----
+{
+  "my-index-000001" : {
+    "settings" : {
+      "index" : {
+        "number_of_replicas" : "2" <1>
+      }
+    }
+  }
+}
+----
+// TESTRESPONSE[skip:the result is for illustration purposes only]
+
++
+<1> Represents the currently configured value for the number of replica shards
+required for the index.
+
+. Use the `_cat/nodes` API to find the number of nodes in the target tier:
++
+[source,console]
+----
+GET /_cat/nodes?h=node.role
+----
+// TEST[continued]
++
+The response will look like this, containing one row per node:
++
+[source,console-result]
+----
+himrst
+mv
+himrst
+----
+// TESTRESPONSE[skip:the result is for illustration purposes only]
++
+You can count the rows that contain the letter representing the target tier to
+find out how many nodes you have in it. See <<cat-nodes-api-query-params>> for
+details. The example above has two rows containing `h`, so there are two nodes
+in the hot tier.
++
+. <<indices-update-settings, Decrease>> the value for the total number of replica
+shards required for this index. Because replica shards cannot reside on the same
+node as their primary shards for <<high-availability-cluster-design,high availability>>,
+the new value needs to be less than or equal to the number of nodes found above
+minus one. Since the example above found two nodes in the hot tier, the maximum
+value for `index.number_of_replicas` is `1`.
++
+[source,console]
+----
+PUT /my-index-000001/_settings
+{
+  "index" : {
+    "number_of_replicas" : 1 <1>
+  }
+}
+----
+// TEST[continued]
+
++
+<1> The new value for the `index.number_of_replicas` index setting is decreased
+from the previous value of `2` to `1`. It can be set as low as `0`, but
+configuring it to `0` for indices other than
+<<searchable-snapshots,searchable snapshot indices>> may lead to temporary
+availability loss during node restarts or permanent data loss in case of data
+corruption.
+
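+Once the new value is applied, you can verify whether all shards are now
+assigned, for example by checking the health of the index. This is a minimal
+sketch; the optional `filter_path` parameter merely narrows the output:
+
+[source,console]
+----
+GET /_cluster/health/my-index-000001?filter_path=status
+----
+// TEST[continued]
+
+A `green` status means that every primary and replica shard is assigned.
+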
+//end::kibana-api-ex-2[]
+// end::cloud[]
+
+// tag::self-managed[]
+To get the replica shards assigned, you can add more nodes to your {es} cluster
+and assign the index's target tier <<assign-data-tier, node role>> to the new
+nodes.
+
+To inspect which tier an index is targeting for assignment, use the <<indices-get-settings, get index setting>>
+API to retrieve the configured value for the `index.routing.allocation.include._tier_preference`
+setting:
+
+[source,console]
+----
+GET /my-index-000001/_settings/index.routing.allocation.include._tier_preference?flat_settings
+----
+// TEST[continued]
+
+
+The response will look like this:
+
+[source,console-result]
+----
+{
+  "my-index-000001": {
+    "settings": {
+      "index.routing.allocation.include._tier_preference": "data_warm,data_hot" <1>
+    }
+  }
+}
+----
+// TESTRESPONSE[skip:the result is for illustration purposes only]
+
+
+<1> Represents a comma-separated list of the data tier node roles the index is
+allowed to be allocated on; the first tier in the list has the highest priority
+and is the one the index is targeting. In this example the tier preference is
+`data_warm,data_hot`, so the index is targeting the `warm` tier and more nodes
+with the `data_warm` role are needed in the {es} cluster.
+
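+For example, a new node intended for the `warm` tier from the example above
+could declare the matching role in its `elasticsearch.yml`. This is a minimal
+sketch; the exact set of roles a node needs depends on your setup:
+
+[source,yaml]
+----
+node.roles: [ data_warm ]
+----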
+
+Alternatively, if adding more nodes to the {es} cluster is not desired,
+inspect the <<dynamic-index-number-of-replicas,`index.number_of_replicas`>> index setting and
+decrease the configured value:
+
+
+. Inspect the <<dynamic-index-number-of-replicas,`index.number_of_replicas`>> index setting for the
+index with unassigned replica shards:
++
+[source,console]
+----
+GET /my-index-000001/_settings/index.number_of_replicas
+----
++
+The response will look like this:
++
+[source,console-result]
+----
+{
+  "my-index-000001" : {
+    "settings" : {
+      "index" : {
+        "number_of_replicas" : "2" <1>
+      }
+    }
+  }
+}
+----
+// TESTRESPONSE[skip:the result is for illustration purposes only]
+
++
+<1> Represents the currently configured value for the number of replica shards
+required for the index.
+
+. Use the `_cat/nodes` API to find the number of nodes in the target tier:
++
+[source,console]
+----
+GET /_cat/nodes?h=node.role
+----
+// TEST[continued]
++
+The response will look like this, containing one row per node:
++
+[source,console-result]
+----
+himrst
+mv
+himrst
+----
+// TESTRESPONSE[skip:the result is for illustration purposes only]
++
+You can count the rows that contain the letter representing the target tier to
+find out how many nodes you have in it. See <<cat-nodes-api-query-params>> for
+details. The example above has two rows containing `h`, so there are two nodes
+in the hot tier.
++
+. <<indices-update-settings, Decrease>> the value for the total number of replica
+shards required for this index. Because replica shards cannot reside on the same
+node as their primary shards for <<high-availability-cluster-design,high availability>>,
+the new value needs to be less than or equal to the number of nodes found above
+minus one. Since the example above found two nodes in the hot tier, the maximum
+value for `index.number_of_replicas` is `1`.
++
+[source,console]
+----
+PUT /my-index-000001/_settings
+{
+  "index" : {
+    "number_of_replicas" : 1 <1>
+  }
+}
+----
+// TEST[continued]
+
++
+<1> The new value for the `index.number_of_replicas` index setting is decreased
+from the previous value of `2` to `1`. It can be set as low as `0`, but
+configuring it to `0` for indices other than
+<<searchable-snapshots,searchable snapshot indices>> may lead to temporary
+availability loss during node restarts or permanent data loss in case of data
+corruption.
+
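+After the new replica count is applied, you can check whether any shards for the
+index remain unassigned, for example with the cat shards API. The `h` column
+selection below is optional:
+
+[source,console]
+----
+GET /_cat/shards/my-index-000001?h=index,shard,prirep,state
+----
+// TEST[continued]
+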
+// end::self-managed[]
+

+ 5 - 3
docs/reference/troubleshooting.asciidoc

@@ -6,7 +6,7 @@
 This section provides a series of troubleshooting solutions aimed at helping users
 fix problems that an {es} deployment might encounter.
 
-Several troubleshooting issues can be diagnosed using the 
+Several troubleshooting issues can be diagnosed using the
 <<health-api,health API>>.
 
 If none of these solutions relate to your issue, you can still get help:
@@ -15,7 +15,7 @@ If none of these solutions relate to your issue, you can still get help:
 
 ** Go directly to the http://support.elastic.co[Support Portal]
 
-** From the {ess} Console, go to the 
+** From the {ess} Console, go to the
  https://cloud.elastic.co/support{ess-baymax}[Support page], or select the
  support icon that looks like a life preserver on any page.
 
@@ -29,7 +29,7 @@ registered email, you can also register a second email address with us. Just
 open a case to let us know the name and email address you want to add.
 ====
 
-* For users without an active subscription, visit the 
+* For users without an active subscription, visit the
 https://discuss.elastic.co/[Elastic community forums] and get answers from
 the experts in the community, including people from Elastic.
 --
@@ -46,6 +46,8 @@ include::troubleshooting/data/data-tiers-mixed-with-node-attr.asciidoc[]
 
 include::troubleshooting/data/diagnose-unassigned-shards.asciidoc[]
 
+include::troubleshooting/data/increase-tier-capacity.asciidoc[]
+
 include::monitoring/troubleshooting.asciidoc[]
 
 include::transform/troubleshooting.asciidoc[leveloffset=+1]

+ 21 - 0
docs/reference/troubleshooting/data/increase-tier-capacity.asciidoc

@@ -0,0 +1,21 @@
+[[increase-tier-capacity]]
+== Not enough nodes to allocate all shard replicas
+
+Distributing copies of the data (index shard replicas) across different nodes can
+parallelize request processing and thus speed up search queries. You can achieve
+this by increasing the number of replica shards up to the maximum value (the
+total number of nodes minus one), which also protects against hardware failure.
+For example, an index on a three-node tier can have at most two replicas. If the
+index has a preferred tier, {es} will only place the copies of the data for that
+index on nodes in the target tier.
+
+If you encounter a warning that there are not enough nodes to allocate all shard
+replicas, you can either add more nodes to the cluster (or to the tier, if tiers
+are in use), or reduce the
+<<dynamic-index-number-of-replicas,`index.number_of_replicas`>> index setting.
+
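+To confirm which shards are unassigned and why, you can first query the
+<<cluster-allocation-explain,cluster allocation explain API>>. This is a sketch;
+the index name and shard number are placeholders for your own values:
+
+[source,console]
+----
+GET _cluster/allocation/explain
+{
+  "index": "my-index-000001",
+  "shard": 0,
+  "primary": false
+}
+----
+// TEST[skip:requires an unassigned shard]
+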
+To fix this, follow these steps:
+
+include::{es-repo-dir}/tab-widgets/troubleshooting/data/increase-tier-capacity-widget.asciidoc[]
+