Task queue backlog
A backlogged task queue can lead to rejected requests or an unhealthy cluster state. Contributing factors include uneven or resource-constrained hardware, a large number of tasks triggered at the same time, expensive tasks that consume high CPU or induce high JVM memory pressure, and long-running tasks.
To identify the cause of the backlog, try these diagnostic actions.
- Check thread pool status
- Inspect node hot threads
- Identify long-running node tasks
- Look for long-running cluster tasks
- Monitor slow logs
Use the cat thread pool API to monitor active threads, queued tasks, rejections, and completed tasks:
GET /_cat/thread_pool?v&s=t,n&h=type,name,node_name,pool_size,active,queue_size,queue,rejected,completed
These thread pool metrics are interpreted as follows:
- The active and queue statistics are point-in-time.
- The rejected and completed statistics are cumulative from node start-up.
- The thread pool fills active until it reaches the pool_size, at which point it fills queue until it reaches the queue_size, after which it rejects requests.
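If you prefer a JSON view of the same counters, for example for scripting or repeated polling, the node stats API exposes them per pool. A minimal sketch, assuming the write and search pools are the ones of interest:

# Per-node counters for the write and search pools only
GET /_nodes/stats/thread_pool?filter_path=nodes.*.name,nodes.*.thread_pool.write,nodes.*.thread_pool.search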
There are a number of things that you can check as potential causes for the queue backlog:
- Look for continually high queue metrics, which indicate long-running tasks or CPU-expensive tasks.
- Look for bursts of elevated queue metrics, which indicate opportunities to spread traffic volume.
- Determine whether thread pool issues are specific to a node role.
- Check whether a specific node is depleting faster than others within a data tier. This might indicate hot spotting.
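For example, to compare a single pool across nodes rather than scanning the full output, you can narrow the cat thread pool API to specific pool names; the search and write pools here are only an illustration:

# Compare the search and write pools node by node
GET /_cat/thread_pool/search,write?v=true&h=node_name,name,active,queue,rejected,completed&s=node_name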
If a particular thread pool queue is backed up, periodically poll the CPU-related APIs to gauge task progression versus resource constraints:
- The nodes hot threads API: GET /_nodes/hot_threads
- The cat nodes API: GET _cat/nodes?v=true&s=cpu:desc
If CPU usage is consistently elevated, or a hot thread's stack trace does not change across repeated polls over an extended period, investigate high CPU usage.
Although the hot threads API response does not list the specific tasks running on a thread, it provides a summary of the thread's activities. You can correlate a hot threads response with a task management API response to identify any overlap with specific tasks. For example, if hot threads suggest the node is spending time in search, filter the task management API response for search tasks.
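For example, a minimal correlation workflow might poll both APIs against the same node, using the same placeholder style as the filtering examples below:

# Summarize what the node's busiest threads are doing
GET /_nodes/<YOUR_NODE_ID_OR_NAME_HERE>/hot_threads
# List the search tasks currently running on that same node
GET /_tasks?human&detailed&actions=*search*&nodes=<YOUR_NODE_ID_OR_NAME_HERE>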
Long-running tasks can also cause a backlog. Use the task management API to check for excessive running_time_in_nanos values:
GET /_tasks?pretty=true&human=true&detailed=true
You can filter on a specific action, such as bulk indexing or search-related tasks. If you're investigating particular nodes, you can also filter the results by node.
Filter on bulk index actions:
GET /_tasks?human&detailed&actions=indices:*write*
GET /_tasks?human&detailed&actions=indices:*write*&nodes=<YOUR_NODE_ID_OR_NAME_HERE>
Filter on search actions:
GET /_tasks?human&detailed&actions=indices:*search*
GET /_tasks?human&detailed&actions=indices:*search*&nodes=<YOUR_NODE_ID_OR_NAME_HERE>
Long-running tasks might need to be canceled.
Refer to this video for a walkthrough of how to troubleshoot the task management API output.
You can also check the Tune for search speed and Tune for indexing speed pages for more information.
Use the cat pending tasks API to identify delays in cluster state synchronization:
GET /_cat/pending_tasks?v=true
Cluster state synchronization can be expected to fall behind temporarily when a cluster is unstable; otherwise, a persistent backlog usually indicates a problematic cluster setting override or traffic pattern.
There are a few common source issues to check for:
- ilm-: Index lifecycle management (ILM) polls every 10m by default, as determined by the indices.lifecycle.poll_interval setting. ILM starts asynchronous tasks that are executed as node tasks. If ILM continually reports as a cluster pending task, this setting has likely been overridden; otherwise, the cluster likely has too many indices relative to the master node's heap size.
- put-mapping: Elasticsearch enables dynamic mapping by default. This, or the update mapping API, triggers a mapping update. In this case, the corresponding cluster log contains an update_mapping entry with the name of the affected index.
- shard-started: Indicates active shard recoveries. Overriding cluster.routing.allocation.* settings can cause pending tasks and recoveries to back up.
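To check whether any of these settings have been overridden, one rough approach is to list the cluster's persistent and transient settings and look for indices.lifecycle.poll_interval or cluster.routing.allocation.* entries:

# Show only explicit overrides, not defaults
GET /_cluster/settings?flat_settings=true&filter_path=persistent,transient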
If you're not present during an incident to investigate backlogged tasks, you might consider enabling slow logs to review later.
For example, you can later review slow search logs using the search profiler, so that time-consuming requests can be optimized.
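As a sketch, assuming an index named my-index and warn thresholds that you would tune for your own workload, slow logs can be enabled with dynamic index settings:

PUT /my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.fetch.warn": "1s",
  "index.indexing.slowlog.threshold.index.warn": "10s"
}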
As noted above, task backlogs are frequently due to:
- a traffic volume spike
- expensive tasks that are causing high CPU
- long-running tasks
- hot spotting, particularly from uneven or resource-constrained hardware
Many of these can be investigated in isolation as unintended changes to traffic patterns or configuration. Refer to the following recommendations to address repeated or long-standing symptoms.
If an individual task is causing a thread pool queue to back up due to high CPU usage, try cancelling the task and then optimizing it before retrying.
This problem can surface due to a number of possible causes:
- Creating new tasks or modifying scheduled tasks which either run frequently or are broad in their effect, such as index lifecycle management policies or rules
- Performing traffic load testing
- Doing extended look-backs, especially across data tiers
- Searching or performing bulk updates to a high number of indices in a single request
If an active task’s hot thread shows no progress, consider canceling the task if it's flagged as cancellable.
If you consistently encounter cancellable tasks running longer than expected, consider:
- setting a search.default_search_timeout
- ensuring scroll requests are cleared in a timely manner
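As an illustration, with values you would tune for your own workload, a default search timeout can be applied as a dynamic cluster setting, and outstanding scroll contexts can be cleared explicitly:

# Apply a cluster-wide default timeout to searches that don't set their own
PUT /_cluster/settings
{
  "persistent": {
    "search.default_search_timeout": "30s"
  }
}

# Release all open scroll contexts
DELETE /_search/scroll/_all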
For example, you can use the task management API to identify and cancel searches that consume excessive CPU time.
GET _tasks?actions=*search&detailed
The response description contains the search request and its queries. The running_time_in_nanos parameter shows how long the search has been running.
{
"nodes" : {
"oTUltX4IQMOUUVeiohTt8A" : {
"name" : "my-node",
"transport_address" : "127.0.0.1:9300",
"host" : "127.0.0.1",
"ip" : "127.0.0.1:9300",
"tasks" : {
"oTUltX4IQMOUUVeiohTt8A:464" : {
"node" : "oTUltX4IQMOUUVeiohTt8A",
"id" : 464,
"type" : "transport",
"action" : "indices:data/read/search",
"description" : "indices[my-index], search_type[QUERY_THEN_FETCH], source[{\"query\":...}]",
"start_time_in_millis" : 4081771730000,
"running_time_in_nanos" : 13991383,
"cancellable" : true
}
}
}
}
}
To cancel this example search to free up resources, you would run:
POST _tasks/oTUltX4IQMOUUVeiohTt8A:464/_cancel
For additional tips on how to track and avoid resource-intensive searches, see Avoid expensive searches.
If a specific node’s thread pool is depleting faster than its data tier peers, try addressing uneven node resource utilization, also known as "hot spotting". For details about reparative actions you can take, such as rebalancing shards, refer to the Hot spotting troubleshooting documentation.
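As a quick first check rather than a full diagnosis, you can compare shard counts, disk, CPU, and heap across nodes to see whether one node stands out:

# Shard and disk distribution per node
GET /_cat/allocation?v=true&s=node
# CPU, heap, and disk usage per node, busiest first
GET /_cat/nodes?v=true&h=name,cpu,heap.percent,disk.used_percent&s=cpu:desc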
By default, Elasticsearch allocates processors equal to the number reported available by the operating system. You can override this behaviour by adjusting the value of node.processors, but this advanced setting should be configured only after you've performed load testing.
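For example, if load testing shows that Elasticsearch should use only a subset of the host's CPUs, you might pin the value in elasticsearch.yml; the number here is purely illustrative:

node.processors: 4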
In some cases, you might need to increase the size of the problematic thread pool. For example, it might help to increase the size of a stuck force_merge thread pool. If the size is automatically calculated to 1 based on the available CPU processors, you can increase it to 2 in elasticsearch.yml, for example:
thread_pool.force_merge.size: 2