Elasticsearch scaling considerations

Applies to: ECE, ECK, Elastic Cloud Hosted, Self-managed

Knowing when and how to scale your deployment is critical, especially when unexpected workloads hit. Adding more nodes or adjusting resources is not always the best solution. Instead, scaling should be based on real workload patterns and informed decision-making.

In orchestrated or managed deployments, autoscaling can automatically adjust cluster resources based on demand, reducing operational overhead. However, if you choose not to use it, or in self-managed environments, scaling becomes a manual process that requires careful planning to adapt to workload changes and keep the cluster performant and resilient.
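
As an illustration, orchestrator-driven autoscaling is expressed through Elasticsearch autoscaling policies, which the orchestrator (Elastic Cloud Hosted, ECE, or ECK) acts on. The following is a minimal sketch only; the policy name, role, and decider settings are placeholders, not recommendations:

```console
PUT _autoscaling/policy/logs-hot-tier
{
  "roles": [ "data_hot" ],
  "deciders": {
    "proactive_storage": {
      "forecast_window": "30m"
    }
  }
}
```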

Note

In Elastic Cloud Serverless projects, Elastic manages all scaling and performance tuning automatically. You don't need to configure nodes, resources, or autoscaling parameters.

Refer to Sizing Elasticsearch: Scaling up and out to learn which questions to ask when determining the cluster size that best fits your Elasticsearch use case.

To make informed scaling decisions, cluster monitoring is essential. Metrics such as CPU usage, memory pressure, disk I/O, query response times, and shard distribution provide insights into when scaling may be necessary.
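
For a quick view of several of these metrics, you can use the cat nodes and cluster health APIs. The column selection below is only illustrative, and the thresholds that should trigger scaling depend on your workload:

```console
GET _cat/nodes?v&h=name,cpu,load_1m,heap.percent,ram.percent,disk.used_percent&s=cpu:desc

GET _cluster/health?filter_path=status,number_of_nodes,active_shards,relocating_shards,unassigned_shards
```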

Scaling isn’t just about adding more nodes—it also involves optimizing the cluster configuration for better performance. Adjustments such as shard and index tuning, query optimizations, caching strategies, and efficient resource allocation can improve performance without requiring additional hardware. These optimizations directly influence scaling strategies, because a well-tuned cluster can handle more workload with fewer resources.
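
As a small example of such tuning, the dynamic index settings below trade some search freshness and redundancy for indexing throughput. The index name is a placeholder and the values are illustrative, not recommendations for your workload:

```console
PUT my-index/_settings
{
  "index": {
    "refresh_interval": "30s",
    "number_of_replicas": 1
  }
}
```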

When adding zones for fault tolerance or high availability (HA), it might seem like you’re also scaling up. While additional zones might improve performance, they should not be relied upon for additional capacity.

In Elastic Cloud Hosted and Elastic Cloud Enterprise, the concept of zones is intended for:

  • High availability (two zones)
  • Fault tolerance (three zones)

Neither works if the cluster depends on the resources in those zones simply to stay operational.

For true HA at the zone level, the recommended approach is to first scale up resources within a single zone until the cluster can handle the full load (with some buffer to absorb request peaks), then scale out by adding zones according to your requirements: two zones for high availability, three zones for fault tolerance.

Although this is the general recommendation, you should design your cluster to best support your HA requirements. Just make sure you fully understand the implications of your choices and plan accordingly.
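
In self-managed deployments, one way to make zone-level HA explicit is shard allocation awareness: each node is tagged with a custom attribute (for example, node.attr.zone in elasticsearch.yml), and the cluster is told to spread shard copies across the values of that attribute. The attribute name zone below is an assumed example; orchestrated deployments (Elastic Cloud Hosted, ECE, ECK) manage zone placement for you.

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "zone"
  }
}
```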

Scaling Elasticsearch depends on how you deploy it. Refer to the appropriate guides below based on your deployment type.

These platforms offer built-in autoscaling and flexible resource management:

In self-managed environments, scaling requires manual intervention: