High JVM memory pressure
High JVM memory usage can degrade cluster performance and trigger circuit breaker errors. To prevent this, we recommend taking steps to reduce memory pressure if a node’s JVM memory usage consistently exceeds 85%.
Elasticsearch's JVM uses the G1GC garbage collector. Over time, this causes the JVM heap usage metric to follow a sawtooth pattern, as shown in Understanding JVM heap memory. For this reason, we recommend focusing monitoring on JVM memory pressure, which is a rolling average of old-generation garbage collection and better represents the node's ongoing JVM responsiveness.
You can also use the nodes stats API to calculate the current JVM memory pressure for each node.
GET _nodes/stats?filter_path=nodes.*.name,nodes.*.jvm.mem.pools.old
You can calculate the memory pressure from the ratio of used_in_bytes to max_in_bytes. For example, you can store this output in nodes_stats.json and then parse it with the third-party tool jq:
cat nodes_stats.json | jq -rc '.nodes[]|.name as $n|.jvm.mem.pools.old|{name:$n, memory_pressure:(100*.used_in_bytes/.max_in_bytes|round) }'
Elastic Cloud Hosted and Elastic Cloud Enterprise also show a JVM memory pressure indicator for each node in your cluster on the deployment overview, as discussed in their memory pressure documentation. These indicators turn red once JVM memory pressure reaches 75%.
As memory usage increases, garbage collection becomes more frequent and takes longer. You can track the frequency and length of garbage collection events in elasticsearch.log. For example, the following event indicates that Elasticsearch spent more than 50% (21 seconds) of the last 40 seconds performing garbage collection:
[timestamp_short_interval_from_last][INFO ][o.e.m.j.JvmGcMonitorService] [node_id] [gc][number] overhead, spent [21s] collecting in the last [40s]
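For example, assuming a default Linux package installation where Elasticsearch writes logs under /var/log/elasticsearch (adjust this path for your setup), you can list recent garbage collection overhead events with a quick grep:
grep "overhead, spent" /var/log/elasticsearch/*.log | tail -n 20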
Garbage collection also surfaces in the nodes hot threads API output under OTHER_CPU, as described in Troubleshooting high CPU usage.
For optimal JVM performance, garbage collection (GC) should meet these criteria:
| GC type | Completion time | Frequency |
|---|---|---|
| Young GC | <50ms | ~once per 10 seconds |
| Old GC | <1s | ≤once per 10 minutes |
To determine the exact reason for the high JVM memory pressure, capture and review a heap dump of the JVM while its memory usage is high.
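As a minimal sketch on a self-managed node, you can usually capture a heap dump with the jmap tool from the same JDK that runs Elasticsearch; the output path and process ID below are placeholders:
jmap -dump:format=b,file=/tmp/elasticsearch-heap.hprof <elasticsearch_pid>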
If you have an Elastic subscription, you can request Elastic's assistance in reviewing this output. When doing so, please:
- Grant written permission for Elastic to review your uploaded heap dumps within the support case.
- Share these files only after receiving any necessary business approvals, as they might contain private information. Files are handled according to Elastic's privacy statement.
- Share heap dumps through our secure Support Portal. If your files are too large to upload, you can request a secure URL in the support case.
- Share the garbage collector logs covering the same time period.
AutoOps is a monitoring tool that simplifies cluster management through performance recommendations, resource utilization visibility, and real-time issue detection with resolution paths. Learn more about AutoOps.
To track JVM memory usage over time, we recommend enabling monitoring:
- (Recommended) Enable AutoOps.
- Enable logs and metrics (Elasticsearch monitoring on self-managed clusters). When logs and metrics are enabled, monitoring information is visible on Kibana's Stack Monitoring page. You can also enable the JVM memory threshold alert to be notified about potential issues through email.
- On Elastic Cloud Hosted and Elastic Cloud Enterprise, you can also view the memory pressure troubleshooting charts on the Performance page in your deployment menu.
This section contains some common suggestions for reducing JVM memory pressure.
This section contains some common reasons why JVM memory pressure can remain high in the background or respond non-linearly during performance issues.
The JVM manages Elasticsearch's heap memory and can suffer severe performance degradation if the operating system swaps it out. We recommend disabling swap.
Per the guide, you can attempt to disable swap at the Elasticsearch level by setting bootstrap.memory_lock. Elasticsearch will then attempt to set mlockall; however, this may fail. To check the setting and its outcome, poll the node information API:
GET _nodes?filter_path=**.mlockall,**.memory_lock,nodes.*.name
For example, you can store this output in nodes.json and then parse it with the third-party tool jq:
cat nodes.json | jq -rc '.nodes[]|{node:.name, memory_lock:.settings.bootstrap.memory_lock, mlockall:.process.mlockall}'
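For reference, the Elasticsearch-level attempt described above is a single setting in elasticsearch.yml (it requires a node restart, and on Linux the Elasticsearch user also needs permission to lock memory):
bootstrap.memory_lock: true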
Elasticsearch recommends completely disabling swap at the operating system level. This is because anything set at the Elasticsearch level is best effort, while swapping can severely impact Elasticsearch performance. To check whether any nodes are currently swapping, poll the nodes stats API:
GET _nodes/stats?filter_path=**.swap,nodes.*.name
For example, you can store this output in nodes_stats.json and then parse it with the third-party tool jq:
cat nodes_stats.json | jq -rc '.nodes[]|{name:.name, swap_used:.os.swap.used_in_bytes}' | sort
If nodes are still swapping after you attempt to disable it at the Elasticsearch level, you need to disable swap at the operating system level to avoid the performance impact.
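As a sketch for Linux hosts (exact commands and file locations vary by distribution), you can turn swap off immediately and reduce the kernel's tendency to swap; to make this permanent, also remove any swap entries from /etc/fstab:
sudo swapoff -a
sudo sysctl vm.swappiness=1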
JVM performance strongly depends on having compressed ordinary object pointers (compressed oops) enabled. The exact maximum heap size cutoff depends on the operating system, but it usually caps near 30GB. To check whether compressed oops are enabled, poll the node information API:
GET _nodes?filter_path=nodes.*.name,nodes.*.jvm.using_compressed_ordinary_object_pointers
For example, you can store this output in nodes.json and then parse it with the third-party tool jq:
cat nodes.json | jq -rc '.nodes[]|{node:.name, compressed:.jvm.using_compressed_ordinary_object_pointers}'
By default, Elasticsearch manages the JVM heap size. If you override it manually, Xms and Xmx should be equal and no more than half of the total operating system RAM, per Set the JVM heap size.
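For example, on a self-managed node a manual override typically goes in a custom file under config/jvm.options.d; the 4g value below is purely illustrative and should be sized for your node:
-Xms4g
-Xmx4g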
To check these heap settings, poll the node information API:
GET _nodes?filter_path=nodes.*.name,nodes.*.jvm.mem
For example, you can store this output in nodes.json and then parse it with the third-party tool jq:
cat nodes.json | jq -rc '.nodes[]|.name as $n|.jvm.mem|{name:$n, heap_min:.heap_init_in_bytes, heap_max:.heap_max_in_bytes}'
Every shard uses memory. Usually a small set of large shards uses fewer resources than many small shards. For tips on reducing your shard count, refer to Size your shards.
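For example, to see how many shards each node currently holds, you can poll the cat allocation API:
GET _cat/allocation?v=true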
This section contains some common suggestions for reducing JVM memory pressure related to traffic patterns.
Expensive searches can use large amounts of memory. To better track expensive searches on your cluster, enable slow logs.
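As a sketch, search slow logs are enabled per index by setting thresholds; the index name and threshold values below are only examples:
PUT my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.fetch.warn": "5s"
}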
Expensive searches may have a large size argument, use aggregations with a large number of buckets, or include expensive queries. To prevent expensive searches, consider the following setting changes:
- Lower the size limit using the index.max_result_window index setting.
- Decrease the maximum number of allowed aggregation buckets using the search.max_buckets cluster setting.
- Disable expensive queries using the search.allow_expensive_queries cluster setting.
- Set a default search timeout using the search.default_search_timeout cluster setting (see the example below).
PUT _settings
{
"index.max_result_window": 5000
}
PUT _cluster/settings
{
"persistent": {
"search.max_buckets": 20000,
"search.allow_expensive_queries": false
}
}
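The search.default_search_timeout cluster setting from the list above can be applied the same way; the 30s value is only illustrative:
PUT _cluster/settings
{
  "persistent": {
    "search.default_search_timeout": "30s"
  }
}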
Defining too many fields or nesting fields too deeply can lead to mapping explosions that use large amounts of memory. To prevent mapping explosions, use the mapping limit settings to limit the number of field mappings.
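As an example sketch, the field-count and nesting-depth limits are dynamic index settings; the index name and values below are only illustrative:
PUT my-index/_settings
{
  "index.mapping.total_fields.limit": 500,
  "index.mapping.depth.limit": 10
}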
You might also consider setting the Kibana Advanced Setting data_views:fields_excluded_data_tiers for faster performance. For example, to avoid hitting searchable snapshots on the cold and frozen data tiers, you would set this to data_cold,data_frozen. This can help fields in Discover load faster, as described in Troubleshooting guide: Solving 6 common issues in Kibana Discover load.
While more efficient than individual requests, large bulk indexing or multi-search requests can still create high JVM memory pressure. If possible, submit smaller requests and allow more time between them.
Heavy indexing and search loads can cause high JVM memory pressure. To better handle heavy workloads, upgrade your nodes to increase their memory capacity.