Troubleshoot shard capacity health issues

Simplify monitoring with AutoOps (Serverless, ECH, ECK, ECE, Self-Managed)

AutoOps is a monitoring tool that simplifies cluster management through performance recommendations, resource utilization visibility, and real-time issue detection with resolution paths. Learn more about AutoOps.

Elasticsearch limits the maximum number of shards held per node using the cluster.max_shards_per_node and cluster.max_shards_per_node.frozen settings. The cluster's current shard capacity is reported in the shards capacity section of the health API.
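As a rough illustration of how the two settings translate into a cluster-wide limit, the sketch below assumes that each limit is the per-node setting multiplied by the number of nodes in that group (non-frozen data nodes or frozen nodes). The default values of 1000 and 3000 match the sample responses on this page; the function name and the node counts are hypothetical.

```python
# Default values of the two settings; the node counts below are made-up inputs.
DEFAULT_MAX_SHARDS_PER_NODE = 1000
DEFAULT_MAX_SHARDS_PER_NODE_FROZEN = 3000


def cluster_shard_limit(max_shards_per_node: int, node_count: int) -> int:
    """Cluster-wide limit for one node group.

    Assumption: limit = per-node setting * number of nodes in the group.
    """
    if max_shards_per_node < 0 or node_count < 0:
        raise ValueError("inputs must be non-negative")
    return max_shards_per_node * node_count


# Hypothetical cluster: 3 non-frozen data nodes and 1 frozen node.
data_limit = cluster_shard_limit(DEFAULT_MAX_SHARDS_PER_NODE, 3)
frozen_limit = cluster_shard_limit(DEFAULT_MAX_SHARDS_PER_NODE_FROZEN, 1)
print(data_limit, frozen_limit)  # 3000 3000
```

Under this assumption, adding a node to a group raises that group's limit without touching the setting, which is why adding nodes is one of the long-term options.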

The cluster.max_shards_per_node cluster setting limits the maximum number of open shards for a cluster, only counting data nodes that do not belong to the frozen tier.

This symptom indicates that action should be taken; otherwise, either the creation of new indices or upgrading the cluster could be blocked.

If you're confident your changes won't destabilize the cluster, you can temporarily increase the limit using the cluster update settings API.

You can perform the following steps using either the API console or direct Elasticsearch API calls.

  1. Check the current status of the cluster according to the shards capacity indicator:

    GET _health_report/shards_capacity

    The response will look like this:

    {
      "cluster_name": "...",
      "indicators": {
        "shards_capacity": {
          "status": "yellow",
          "symptom": "Cluster is close to reaching the configured maximum number of shards for data nodes.",
          "details": {
            "data": {
              "max_shards_in_cluster": 1000,
              "current_used_shards": 988
            },
            "frozen": {
              "max_shards_in_cluster": 3000,
              "current_used_shards": 0
            }
          },
          "impacts": [
            ...
          ],
          "diagnosis": [
            ...
          ]
        }
      }
    }
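
    In the data section of this response, max_shards_in_cluster reflects the current value of the cluster.max_shards_per_node setting, and current_used_shards is the current number of open shards across the cluster.
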
  2. Raise the cluster.max_shards_per_node setting to a value that leaves room for the current shard count, using the cluster settings API:

    PUT _cluster/settings
    {
      "persistent" : {
        "cluster.max_shards_per_node": 1200
      }
    }

    This increase should only be temporary. As a long-term solution, we recommend you add nodes to the oversharded data tier or reduce your cluster's shard count on nodes that do not belong to the frozen tier.

  3. To verify that the change has fixed the issue, check the current status of the shards_capacity indicator:

    GET _health_report/shards_capacity

    The response will look like this:

    {
      "cluster_name": "...",
      "indicators": {
        "shards_capacity": {
          "status": "green",
          "symptom": "The cluster has enough room to add new shards.",
          "details": {
            "data": {
              "max_shards_in_cluster": 1200
            },
            "frozen": {
              "max_shards_in_cluster": 3000
            }
          }
        }
      }
    }
    		
  4. When a long-term solution is in place, reset the cluster.max_shards_per_node limit:

    PUT _cluster/settings
    {
      "persistent" : {
        "cluster.max_shards_per_node": null
      }
    }
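
The steps above can be sketched in Python as a small helper that reads a health report and works out how much room is left before the limit is reached. This is a minimal sketch, not an official client: the function names are hypothetical, and the sample is trimmed from the response shown in step 1.

```python
import json
from typing import Optional

# Sample trimmed from the health report response shown in step 1.
SAMPLE_REPORT = json.loads("""
{
  "indicators": {
    "shards_capacity": {
      "status": "yellow",
      "details": {
        "data": {"max_shards_in_cluster": 1000, "current_used_shards": 988},
        "frozen": {"max_shards_in_cluster": 3000, "current_used_shards": 0}
      }
    }
  }
}
""")


def shard_headroom(report: dict, group: str) -> Optional[int]:
    """Return how many more shards the group ("data" or "frozen") can hold.

    Returns None when the response does not include current_used_shards.
    """
    details = report["indicators"]["shards_capacity"]["details"][group]
    used = details.get("current_used_shards")
    if used is None:
        return None
    return details["max_shards_in_cluster"] - used


def settings_body(setting: str, value: Optional[int]) -> dict:
    """Build the cluster settings request body; pass None to reset the setting."""
    return {"persistent": {setting: value}}


print(shard_headroom(SAMPLE_REPORT, "data"))    # 12
print(shard_headroom(SAMPLE_REPORT, "frozen"))  # 3000
# Bodies for step 2 (temporary increase) and step 4 (reset to the default).
print(json.dumps(settings_body("cluster.max_shards_per_node", 1200)))
print(json.dumps(settings_body("cluster.max_shards_per_node", None)))
```

With the sample data, only 12 more shards fit on non-frozen data nodes, which is why the indicator reports yellow. Passing None serializes to null, matching the reset request in step 4.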
    		

The cluster.max_shards_per_node.frozen cluster setting limits the maximum number of open shards for a cluster, only counting data nodes that belong to the frozen tier.

This symptom indicates that action should be taken; otherwise, either the creation of new indices or upgrading the cluster could be blocked.

If you're confident your changes won't destabilize the cluster, you can temporarily increase the limit using the cluster update settings API.

You can perform the following steps using either the API console or direct Elasticsearch API calls.

  1. Check the current status of the cluster according to the shards capacity indicator:

    GET _health_report/shards_capacity

    The response will look like this:

    {
      "cluster_name": "...",
      "indicators": {
        "shards_capacity": {
          "status": "yellow",
          "symptom": "Cluster is close to reaching the configured maximum number of shards for frozen nodes.",
          "details": {
            "data": {
              "max_shards_in_cluster": 1000
            },
            "frozen": {
              "max_shards_in_cluster": 3000,
              "current_used_shards": 2998
            }
          },
          "impacts": [
            ...
          ],
          "diagnosis": [
            ...
          ]
        }
      }
    }
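
    In the frozen section of this response, max_shards_in_cluster reflects the current value of the cluster.max_shards_per_node.frozen setting, and current_used_shards is the current number of open shards used by frozen nodes across the cluster.
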
  2. Raise the cluster.max_shards_per_node.frozen setting using the cluster settings API:

    PUT _cluster/settings
    {
      "persistent" : {
        "cluster.max_shards_per_node.frozen": 3200
      }
    }

    This increase should only be temporary. As a long-term solution, we recommend you add nodes to the oversharded data tier or reduce your cluster's shard count on nodes that belong to the frozen tier.

  3. To verify that the change has fixed the issue, check the current status of the shards_capacity indicator:

    GET _health_report/shards_capacity

    The response will look like this:

    {
      "cluster_name": "...",
      "indicators": {
        "shards_capacity": {
          "status": "green",
          "symptom": "The cluster has enough room to add new shards.",
          "details": {
            "data": {
              "max_shards_in_cluster": 1000
            },
            "frozen": {
              "max_shards_in_cluster": 3200
            }
          }
        }
      }
    }
    		
  4. When a long-term solution is in place, reset the cluster.max_shards_per_node.frozen limit:

    PUT _cluster/settings
    {
      "persistent" : {
        "cluster.max_shards_per_node.frozen": null
      }
    }
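
For direct API calls, the frozen-tier steps above might be scripted along these lines with Python's standard library. Treat it as a sketch under stated assumptions: the base URL points at an unsecured local cluster, no authentication is configured, and all helper names are hypothetical. The request paths and bodies are the ones shown in the steps.

```python
import json
import urllib.request

# Assumption: an unsecured cluster on localhost. Real deployments need TLS and credentials.
BASE_URL = "http://localhost:9200"
FROZEN_SETTING = "cluster.max_shards_per_node.frozen"


def build_request(method, path, body=None, base_url=BASE_URL):
    """Build (but do not send) a JSON request for the given API path."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {} if data is None else {"Content-Type": "application/json"}
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send a request and decode the JSON response. Needs a reachable cluster."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def capacity_status(report):
    """Status of the shards_capacity indicator (covers data and frozen nodes together)."""
    return report["indicators"]["shards_capacity"]["status"]


def frozen_limit_request(value):
    """Steps 2 and 4: set the frozen limit, or reset it by passing None."""
    return build_request("PUT", "_cluster/settings", {"persistent": {FROZEN_SETTING: value}})


def raise_frozen_limit(new_limit):
    """Steps 1 to 3: check the indicator, raise the limit if needed, check again."""
    if capacity_status(send(build_request("GET", "_health_report/shards_capacity"))) == "green":
        return "green"  # nothing to do
    send(frozen_limit_request(new_limit))
    return capacity_status(send(build_request("GET", "_health_report/shards_capacity")))
```

Calling raise_frozen_limit(3200) mirrors steps 1 to 3, and send(frozen_limit_request(None)) mirrors the reset in step 4 once a long-term solution is in place. Keeping request construction separate from sending makes it possible to inspect the requests without a running cluster.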