|
@@ -0,0 +1,140 @@
|
|
|
+// tag::cloud[]
|
|
|
+**Use {kib}**
|
|
|
+
|
|
|
+//tag::kibana-api-ex[]
|
|
|
+. Log in to the {ess-console}[{ecloud} console].
|
|
|
++
|
|
|
+
|
|
|
+. On the **Elasticsearch Service** panel, click the name of your deployment.
|
|
|
++
|
|
|
+
|
|
|
+NOTE: If the name of your deployment is disabled your {kib} instances might be
|
|
|
+unhealthy, in which case please contact https://support.elastic.co[Elastic Support].
|
|
|
+If your deployment doesn't include {kib}, all you need to do is
|
|
|
+{cloud}/ec-access-kibana.html[enable it first].
|
|
|
++
|
|
|
+. Open your deployment's side navigation menu (placed under the Elastic logo in the upper left corner)
|
|
|
+and go to **Stack Management > Index Management**.
|
|
|
+
|
|
|
+. In the list of all your indices, click the `Replicas` column twice to sort the indices based on their number of
|
|
|
+replicas starting with the one that has the most. Go through the indices and pick one by one the index with the
|
|
|
+least importance and higher number of replicas.
|
|
|
++
|
|
|
+WARNING: Reducing the replicas of an index can potentially reduce search throughput and data redundancy.
|
|
|
++
|
|
|
+. For each index you chose, click on its name, then on the panel that appears click `Edit settings`, reduce the
|
|
|
+value of the `index.number_of_replicas` to the desired value and then click `Save`.
|
|
|
++
|
|
|
+[role="screenshot"]
|
|
|
+image::images/troubleshooting/disk/reduce_replicas.png[Reducing replicas,align="center"]
|
|
|
++
|
|
|
+. Continue this process until the cluster is healthy again.
|
|
|
+
|
|
|
+// end::cloud[]
|
|
|
+
|
|
|
+// tag::self-managed[]
|
|
|
+In order to estimate how many replicas need to be removed, first you need to estimate the amount of disk space that
|
|
|
+needs to be released.
|
|
|
+
|
|
|
+. First, retrieve the relevant disk thresholds that will indicate how much space should be released. The
|
|
|
+relevant thresholds are the <<cluster-routing-watermark-high, high watermark>> for all the tiers apart from the frozen
|
|
|
+one and the <<cluster-routing-flood-stage-frozen, frozen flood stage watermark>> for the frozen tier. The following
|
|
|
+example demonstrates disk shortage in the hot tier, so we will only retrieve the high watermark:
|
|
|
++
|
|
|
+[source,console]
|
|
|
+----
|
|
|
+GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high*
|
|
|
+----
|
|
|
++
|
|
|
+The response will look like this:
|
|
|
++
|
|
|
+[source,console-result]
|
|
|
+----
|
|
|
+{
|
|
|
+ "defaults": {
|
|
|
+ "cluster": {
|
|
|
+ "routing": {
|
|
|
+ "allocation": {
|
|
|
+ "disk": {
|
|
|
+ "watermark": {
|
|
|
+ "high": "90%",
|
|
|
+ "high.max_headroom": "150GB"
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+}
|
|
|
+----
|
|
|
+// TEST[skip:illustration purposes only]
|
|
|
++
|
|
|
+The above means that in order to resolve the disk shortage we need to either drop our disk usage below the 90% or have
|
|
|
+more than 150GB available, read more on how this threshold works <<cluster-routing-watermark-high, here>>.
|
|
|
+
|
|
|
+. The next step is to find out the current disk usage; this will indicate how much space should be freed. For simplicity,
|
|
|
+our example has one node, but you can apply the same for every node over the relevant threshold.
|
|
|
++
|
|
|
+[source,console]
|
|
|
+----
|
|
|
+GET _cat/allocation?v&s=disk.avail&h=node,disk.percent,disk.avail,disk.total,disk.used,disk.indices,shards
|
|
|
+----
|
|
|
++
|
|
|
+The response will look like this:
|
|
|
++
|
|
|
+[source,console-result]
|
|
|
+----
|
|
|
+node disk.percent disk.avail disk.total disk.used disk.indices shards
|
|
|
+instance-0000000000 91 4.6gb 35gb 31.1gb 29.9gb 111
|
|
|
+----
|
|
|
+// TEST[skip:illustration purposes only]
|
|
|
+
|
|
|
+. The high watermark configuration indicates that the disk usage needs to drop below 90%. Consider allowing some
|
|
|
+padding, so the node will not go over the threshold in the near future. In this example, let's release approximately 7GB.
|
|
|
+
|
|
|
+. The next step is to list all the indices and choose which replicas to reduce.
|
|
|
++
|
|
|
+NOTE: The following command orders the indices with descending number of replicas and primary store size. We do this to
|
|
|
+help you choose which replicas to reduce under the assumption that the more replicas you have the smaller the risk if
|
|
|
+you remove a copy and the bigger the replica the more space will be released. This does not take into consideration any
|
|
|
+functional requirements, so please see it as a mere suggestion.
|
|
|
++
|
|
|
+[source,console]
|
|
|
+----
|
|
|
+GET _cat/indices?v&s=rep:desc,pri.store.size:desc&h=health,index,pri,rep,store.size,pri.store.size
|
|
|
+----
|
|
|
++
|
|
|
+The response will look like:
|
|
|
++
|
|
|
+[source,console-result]
|
|
|
+----
|
|
|
+health index pri rep store.size pri.store.size
|
|
|
+green my_index 2 3 9.9gb 3.3gb
|
|
|
+green my_other_index 2 3 1.8gb 470.3mb
|
|
|
+green search-products 2 3 278.5kb 69.6kb
|
|
|
+green logs-000001 1 0 7.7gb 7.7gb
|
|
|
+----
|
|
|
+// TEST[skip:illustration purposes only]
|
|
|
++
|
|
|
+. In the list above we see that if we reduce the replicas to 1 of the indices `my_index` and `my_other_index` we will
|
|
|
+release the required disk space. It is not necessary to reduce the replicas of `search-products` and `logs-000001` does
|
|
|
+not have any replicas anyway. Reduce the replicas of one or more indices with the <<indices-update-settings,
|
|
|
+index update settings API>>:
|
|
|
++
|
|
|
+WARNING: Reducing the replicas of an index can potentially reduce search throughput and data redundancy.
|
|
|
++
|
|
|
+[source,console]
|
|
|
+----
|
|
|
+PUT my_index,my_other_index/_settings
|
|
|
+{
|
|
|
+ "index.number_of_replicas": 1
|
|
|
+}
|
|
|
+----
|
|
|
+// TEST[skip:illustration purposes only]
|
|
|
+// end::self-managed[]
|
|
|
+
|
|
|
+IMPORTANT: After reducing the replicas please consider there are enough replicas to ensure your search
|
|
|
+performance and reliability requirements. If not, at your earliest convenience (i) consider using
|
|
|
+<<overview-index-lifecycle-management, Index Lifecycle Management>> to manage more efficiently the
|
|
|
+retention of your timeseries data, or (ii) reduce the amount of data you have by disabling the `source` or removing
|
|
|
+less important data, or (iii) increase your disk capacity.
|