Index-level shard allocation
Elastic Stack Self Managed
In Elasticsearch, per-index settings allow you to control the allocation of shards to nodes through index-level shard allocation settings. These settings enable you to specify preferences or constraints for where shards of a particular index should reside. This includes allocating shards to nodes with specific attributes or avoiding certain nodes. This level of control helps optimize resource utilization, balance load, and ensure data redundancy and availability according to your deployment's specific requirements. For additional details, check out:
- Shard allocation filtering: Controlling which shards are allocated to which nodes.
- Delayed allocation: Delaying allocation of unassigned shards caused by a node leaving.
- Total shards per node: A hard limit on the number of shards from the same index per node.
- Data tier allocation: Controls the allocation of indices to data tiers.
You can use shard allocation filters to control where Elasticsearch allocates shards of a particular index. These per-index filters are applied in conjunction with cluster-wide allocation filtering and allocation awareness.
Shard allocation filters can be based on custom node attributes or the built-in _name
, _host_ip
, _publish_ip
, _ip
, _host
, _id
, _tier
and _tier_preference
attributes. Index lifecycle management uses filters based on custom node attributes to determine how to reallocate shards when moving between phases.
The cluster.routing.allocation
settings are dynamic, enabling existing indices to be moved immediately from one set of nodes to another. Shards are only relocated if it is possible to do so without breaking another routing constraint, such as never allocating a primary and replica shard on the same node.
For example, you could use a custom node attribute to indicate a node’s performance characteristics and use shard allocation filtering to route shards for a particular index to the most appropriate class of hardware.
To filter based on a custom node attribute:
Specify the filter characteristics with a custom node attribute in each node’s
elasticsearch.yml
configuration file. For example, if you havesmall
,medium
, andbig
nodes, you could add asize
attribute to filter based on node size.node.attr.size: medium
You can also set custom attributes when you start a node:
./bin/elasticsearch -Enode.attr.size=medium
Add a routing allocation filter to the index. The
index.routing.allocation
settings support three types of filters:include
,exclude
, andrequire
. For example, to tell Elasticsearch to allocate shards from thetest
index to eitherbig
ormedium
nodes, useindex.routing.allocation.include
:PUT test/_settings
{ "index.routing.allocation.include.size": "big,medium" }
If you specify multiple filters the following conditions must be satisfied simultaneously by a node in order for shards to be relocated to it:
- If any
require
type conditions are specified, all of them must be satisfied - If any
exclude
type conditions are specified, none of them may be satisfied - If any
include
type conditions are specified, at least one of them must be satisfied
For example, to move the
test
index tobig
nodes inrack1
, you could specify:PUT test/_settings
{ "index.routing.allocation.require.size": "big", "index.routing.allocation.require.rack": "rack1" }
- If any
Setting | Description |
---|---|
index.routing.allocation.include.{{attribute}} |
Assign the index to a node whose {{attribute}} has at least one of the comma-separated values. |
index.routing.allocation.require.{{attribute}} |
Assign the index to a node whose {{attribute}} has all of the comma-separated values. |
index.routing.allocation.exclude.{{attribute}} |
Assign the index to a node whose {{attribute}} has none of the comma-separated values. |
The index allocation settings support the following built-in attributes:
Attribute | Description |
---|---|
_name |
Match nodes by node name |
_host_ip |
Match nodes by host IP address (IP associated with hostname) |
_publish_ip |
Match nodes by publish IP address |
_ip |
Match either _host_ip or _publish_ip |
_host |
Match nodes by hostname |
_id |
Match nodes by node id |
_tier |
Match nodes by the node’s data tier role. For more details see data tier allocation filtering |
You can use wildcards when specifying attribute values, for example:
PUT test/_settings
{
"index.routing.allocation.include._ip": "192.168.2.*"
}
Unallocated shards are recovered in order of priority, whenever possible. Indices are sorted into priority order as follows:
- the optional
index.priority
setting (higher before lower) - the index creation date (higher before lower)
- the index name (higher before lower)
This means that, by default, newer indices will be recovered before older indices.
Use the per-index dynamically updatable index.priority
setting to customise the index prioritization order. For instance:
PUT index_1
PUT index_2
PUT index_3
{
"settings": {
"index.priority": 10
}
}
PUT index_4
{
"settings": {
"index.priority": 5
}
}
In the above example:
index_3
will be recovered first because it has the highestindex.priority
.index_4
will be recovered next because it has the next highest priority.index_2
will be recovered next because it was created more recently.index_1
will be recovered last.
This setting accepts an integer, and can be updated on a live index with the update index settings API:
PUT index_4/_settings
{
"index.priority": 1
}