  1. [[fix-common-cluster-issues]]
  2. == Fix common cluster issues
  3. This guide describes how to fix common errors and problems with {es} clusters.
  4. ****
  5. If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
  6. ****
  7. <<fix-watermark-errors,Watermark errors>>::
  8. Fix watermark errors that occur when a data node is critically low on disk space
  9. and has reached the flood-stage disk usage watermark.
  10. <<circuit-breaker-errors,Circuit breaker errors>>::
  11. {es} uses circuit breakers to prevent nodes from running out of JVM heap memory.
  12. If {es} estimates an operation would exceed a circuit breaker, it stops
  13. the operation and returns an error.
  14. <<high-cpu-usage,High CPU usage>>::
  15. The most common causes of high CPU usage and their solutions.
  16. <<high-jvm-memory-pressure,High JVM memory pressure>>::
  17. High JVM memory usage can degrade cluster performance and trigger circuit
  18. breaker errors.
  19. <<red-yellow-cluster-status,Red or yellow cluster status>>::
  20. A red or yellow cluster status indicates one or more shards are missing or
  21. unallocated. These unassigned shards increase your risk of data loss and can
  22. degrade cluster performance.
  23. <<rejected-requests,Rejected requests>>::
  24. When {es} rejects a request, it stops the operation and returns an error with a
  25. `429` response code.
  26. <<task-queue-backlog,Task queue backlog>>::
  27. A backlogged task queue can prevent tasks from completing and put the cluster
  28. into an unhealthy state.
  29. <<diagnose-unassigned-shards,Diagnose unassigned shards>>::
  30. There are multiple reasons why shards might get unassigned, ranging from
  31. misconfigured allocation settings to lack of disk space.
  32. <<cluster-fault-detection-troubleshooting,Troubleshooting an unstable cluster>>::
  33. A cluster in which nodes leave unexpectedly is unstable and can create several
  34. issues.
  35. <<mapping-explosion,Mapping explosion>>::
  36. A cluster in which an index or index pattern has exploded with a high count of
  37. mapping fields, which causes performance and lookup issues in {es} and
  38. Kibana.
  39. <<hotspotting,Hot spotting>>::
  40. Hot spotting may occur in {es} when resource utilization is unevenly
  41. distributed across nodes.
  42. include::common-issues/disk-usage-exceeded.asciidoc[]
  43. include::common-issues/circuit-breaker-errors.asciidoc[]
  44. include::common-issues/high-cpu-usage.asciidoc[]
  45. include::common-issues/high-jvm-memory-pressure.asciidoc[]
  46. include::common-issues/red-yellow-cluster-status.asciidoc[]
  47. include::common-issues/rejected-requests.asciidoc[]
  48. include::common-issues/task-queue-backlog.asciidoc[]
  49. include::common-issues/mapping-explosion.asciidoc[]
  50. include::common-issues/hotspotting.asciidoc[]
  51. include::common-issues/diagnose-unassigned-shards.asciidoc[]