- [[high-cpu-usage]]
- === High CPU usage
- {es} uses <<modules-threadpool,thread pools>> to manage CPU resources for
- concurrent operations. High CPU usage typically means one or more thread pools
- are running low on available threads.
- If a thread pool is depleted, {es} will <<rejected-requests,reject requests>>
- related to the thread pool. For example, if the `search` thread pool is
- depleted, {es} will reject search requests until more threads are available.
- You might experience high CPU usage if a <<data-tiers,data tier>>, and therefore the nodes assigned to that tier, is experiencing more traffic than other tiers. This imbalance in resource utilization is also known as <<hotspotting,hot spotting>>.
- [discrete]
- [[diagnose-high-cpu-usage]]
- ==== Diagnose high CPU usage
- **Check CPU usage**
- You can check the CPU usage per node using the <<cat-nodes,cat nodes API>>:
- // tag::cpu-usage-cat-nodes[]
- [source,console]
- ----
- GET _cat/nodes?v=true&s=cpu:desc
- ----
- The response's `cpu` column contains the current CPU usage as a percentage.
- The `name` column contains the node's name. Elevated but transient CPU usage is
- normal. However, if CPU usage is elevated for an extended duration, it should be
- investigated.
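
As a rough illustration of acting on this output, the Python sketch below flags nodes at or above a CPU threshold. It assumes the request was sent with `format=json` and `h=name,cpu`, so the response is a JSON array of objects with string values. The 85% threshold, the function name `nodes_above_threshold`, and the sample node names are illustrative assumptions, not recommended values.

```python
import json


def nodes_above_threshold(cat_nodes_json, threshold=85):
    """Return (name, cpu) pairs for nodes at or above `threshold`, busiest first.

    `cat_nodes_json` is assumed to be the body returned by
    GET _cat/nodes?format=json&h=name,cpu
    """
    busy = []
    for row in json.loads(cat_nodes_json):
        cpu = str(row.get("cpu", "")).strip()
        if not cpu.isdigit():
            # Skip rows with a missing or non-numeric CPU value.
            continue
        if int(cpu) >= threshold:
            busy.append((row.get("name"), int(cpu)))
    return sorted(busy, key=lambda pair: pair[1], reverse=True)


# Hypothetical response body, for illustration only.
sample = '[{"name": "node-a", "cpu": "93"}, {"name": "node-b", "cpu": "12"}]'
print(nodes_above_threshold(sample))
```

A single reading above the threshold is not necessarily a problem. Running the check repeatedly over several minutes helps separate transient spikes from sustained load.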
- To track CPU usage over time, we recommend enabling monitoring:
- include::{es-ref-dir}/tab-widgets/cpu-usage-widget.asciidoc[]
- **Check hot threads**
- If a node has high CPU usage, use the <<cluster-nodes-hot-threads,nodes hot
- threads API>> to check for resource-intensive threads running on the node.
- [source,console]
- ----
- GET _nodes/hot_threads
- ----
- // TEST[s/\/my-node,my-other-node//]
- This API returns a plain-text breakdown of any hot threads. High CPU usage
- frequently correlates with <<task-queue-backlog,a long-running task, or a
- backlog of tasks>>.
- [discrete]
- [[reduce-cpu-usage]]
- ==== Reduce CPU usage
- The following tips outline the most common causes of high CPU usage and their
- solutions.
- **Scale your cluster**
- Heavy indexing and search loads can deplete smaller thread pools. To better
- handle heavy workloads, add more nodes to your cluster or upgrade your existing
- nodes to increase capacity.
- **Spread out bulk requests**
- While more efficient than individual requests, large <<docs-bulk,bulk indexing>>
- or <<search-multi-search,multi-search>> requests still require CPU resources. If
- possible, submit smaller requests and allow more time between them.
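
One way to spread out requests on the client side is to split the work into smaller batches and pause between them. The following Python sketch shows the idea under stated assumptions: `send` is a hypothetical callback that would issue one bulk request per batch, and the batch size and pause length are placeholders to tune for your workload.

```python
import time
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def send_in_batches(docs, send, batch_size=500, pause_seconds=1.0, sleep=time.sleep):
    """Call `send(chunk)` once per batch, pausing between batches.

    `send` is a hypothetical function supplied by the caller, for example
    one that submits the chunk as a single bulk request.
    Returns the number of batches sent.
    """
    count = 0
    for chunk in batched(docs, batch_size):
        if count > 0:
            # Give the cluster time to drain its queues before the next batch.
            sleep(pause_seconds)
        send(chunk)
        count += 1
    return count


# Illustration only: "send" each batch by printing it.
send_in_batches(range(5), send=print, batch_size=2, pause_seconds=0)
```

Passing `sleep` as a parameter keeps the pause easy to replace in tests or with a backoff strategy.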
- **Cancel long-running searches**
- Long-running searches can block threads in the `search` thread pool. To check
- for these searches, use the <<tasks,task management API>>.
- [source,console]
- ----
- GET _tasks?actions=*search&detailed
- ----
- The response's `description` contains the search request and its queries.
- `running_time_in_nanos` shows how long the search has been running.
- [source,console-result]
- ----
- {
-   "nodes" : {
-     "oTUltX4IQMOUUVeiohTt8A" : {
-       "name" : "my-node",
-       "transport_address" : "127.0.0.1:9300",
-       "host" : "127.0.0.1",
-       "ip" : "127.0.0.1:9300",
-       "tasks" : {
-         "oTUltX4IQMOUUVeiohTt8A:464" : {
-           "node" : "oTUltX4IQMOUUVeiohTt8A",
-           "id" : 464,
-           "type" : "transport",
-           "action" : "indices:data/read/search",
-           "description" : "indices[my-index], search_type[QUERY_THEN_FETCH], source[{\"query\":...}]",
-           "start_time_in_millis" : 4081771730000,
-           "running_time_in_nanos" : 13991383,
-           "cancellable" : true
-         }
-       }
-     }
-   }
- }
- ----
- // TESTRESPONSE[skip: no way to get tasks]
- To cancel a search and free up resources, use the API's `_cancel` endpoint.
- [source,console]
- ----
- POST _tasks/oTUltX4IQMOUUVeiohTt8A:464/_cancel
- ----
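
When many searches are running, it can help to filter the task list before deciding what to cancel. The Python sketch below walks a parsed task management response shaped like the example above and builds the cancel request line for each cancellable task that has run longer than a threshold. The function name `long_running_searches` and the 30-second default are illustrative assumptions; the sample data is trimmed from the response shown earlier.

```python
def long_running_searches(tasks_response, min_seconds=30.0):
    """Return (task_id, seconds, cancel_request) tuples, longest-running first.

    Only tasks marked cancellable that have run for at least
    `min_seconds` are included.
    """
    found = []
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            # running_time_in_nanos is reported in nanoseconds.
            seconds = task.get("running_time_in_nanos", 0) / 1e9
            if task.get("cancellable") and seconds >= min_seconds:
                found.append((task_id, seconds, f"POST _tasks/{task_id}/_cancel"))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Trimmed copy of the example response above.
response = {
    "nodes": {
        "oTUltX4IQMOUUVeiohTt8A": {
            "name": "my-node",
            "tasks": {
                "oTUltX4IQMOUUVeiohTt8A:464": {
                    "action": "indices:data/read/search",
                    "running_time_in_nanos": 13991383,
                    "cancellable": True,
                }
            },
        }
    }
}

# The example task has run for roughly 14 ms, so a low threshold is used here.
for task_id, seconds, request in long_running_searches(response, min_seconds=0.01):
    print(f"{task_id} has run for {seconds:.3f}s: {request}")
```

Review each task's `description` before cancelling it, because the threshold alone does not say whether a search is expected to be slow.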
- For additional tips on how to track and avoid resource-intensive searches, see
- <<avoid-expensive-searches,Avoid expensive searches>>.