Shared file system repository
This repository type is only available if you run Elasticsearch on your own hardware. See Manage snapshot repositories for other deployment methods.
Use a shared file system repository to store snapshots on a shared file system.
To register a shared file system repository, first mount the file system to the same location on all master and data nodes. Then add the file system’s path or parent directory to the path.repo setting in elasticsearch.yml for each master and data node. For running clusters, this requires a rolling restart of each node.
Supported path.repo values vary by platform:
Linux and macOS installations support Unix-style paths:
path:
  repo:
    - /mount/backups
    - /mount/long_term_backups
After restarting each node, use Kibana or the create snapshot repository API to register the repository. When registering the repository, specify the file system’s path:
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_fs_backup_location"
  }
}
If you specify a relative path, Elasticsearch resolves the path using the first value in the path.repo setting.
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "my_fs_backup_location"
  }
}
In this example, the first value in the path.repo setting is /mount/backups. The relative path, my_fs_backup_location, resolves to /mount/backups/my_fs_backup_location.
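The resolution rule for Unix-style paths can be illustrated with a small sketch. This is not Elasticsearch's own code: resolve_location is a hypothetical helper that only mirrors the rule stated here, namely that a relative location is joined onto the first path.repo entry.

```python
import posixpath


def resolve_location(location, path_repo):
    """Hypothetical helper mirroring the documented rule (illustration only).

    Absolute locations are returned unchanged; relative locations are
    joined onto the first entry of the path.repo list.
    """
    if posixpath.isabs(location):
        return location
    return posixpath.join(path_repo[0], location)


path_repo = ["/mount/backups", "/mount/long_term_backups"]
print(resolve_location("my_fs_backup_location", path_repo))
```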
Each cluster should register a particular snapshot repository location only once. If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. On all other clusters, register the repository as read-only.
This prevents multiple clusters from writing to the repository at the same time and corrupting the repository’s contents. It also prevents Elasticsearch from caching the repository’s contents, which means that changes made by other clusters will become visible straight away.
To register a file system repository as read-only using the create snapshot repository API, set the readonly parameter to true. Alternatively, you can register a URL repository for the file system.
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "my_fs_backup_location",
    "readonly": true
  }
}
Windows installations support both DOS and Microsoft UNC paths. Escape any backslashes in the paths. For UNC paths, provide the server and share name as a prefix.
path:
  repo:
    - "E:\\Mount\\Backups"                       # DOS path
    - "\\\\MY_SERVER\\Mount\\Long_term_backups"  # UNC path
After restarting each node, use Kibana or the create snapshot repository API to register the repository. When registering the repository, specify the file system’s path:
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "E:\\Mount\\Backups\\My_fs_backup_location"
  }
}
If you specify a relative path, Elasticsearch resolves the path using the first value in the path.repo setting.
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "My_fs_backup_location"
  }
}
In this example, the first value in the path.repo setting is E:\Mount\Backups. The relative path, My_fs_backup_location, resolves to E:\Mount\Backups\My_fs_backup_location.
The fs repository type supports a number of settings to customize how data is stored, which may be specified when creating the repository.
Repository settings cover storage placement, snapshot data layout and compression, throughput limits, read-only mode, and the maximum number of snapshots. For a complete list of all shared file system repository settings, refer to Shared file system repository settings.
Elasticsearch interacts with a shared file system repository using the file system abstraction in your operating system. This means that every Elasticsearch node must be able to perform operations within the repository path, such as creating, opening, and renaming files, and creating and listing directories. Operations performed by one node must also be visible to other nodes as soon as they complete.
Check for common misconfigurations using the Verify snapshot repository API and the Repository analysis API. When the repository is properly configured, these APIs will complete successfully. If the verify repository or repository analysis APIs report a problem then you will be able to reproduce this problem outside Elasticsearch by performing similar operations on the file system directly.
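As a minimal sketch, the two checks can be issued from a script. The example below only builds the requests for the verify (_verify) and repository analysis (_analyze) endpoints without sending them. The base URL http://localhost:9200 and the helper name build_check_requests are assumptions for illustration, and a real cluster will usually also need authentication and TLS settings.

```python
import urllib.request


def build_check_requests(base_url, repository):
    """Build (but do not send) POST requests for the two repository checks.

    Hypothetical helper: base_url and repository are supplied by the caller.
    """
    verify = urllib.request.Request(
        f"{base_url}/_snapshot/{repository}/_verify", method="POST"
    )
    analyze = urllib.request.Request(
        f"{base_url}/_snapshot/{repository}/_analyze", method="POST"
    )
    return verify, analyze


# Assumed address of a local node; adjust for your cluster.
verify, analyze = build_check_requests("http://localhost:9200", "my_fs_backup")
print(verify.get_method(), verify.full_url)
print(analyze.get_method(), analyze.full_url)
# To send a request, pass it to urllib.request.urlopen(...).
```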
If the verify repository or repository analysis APIs fail with an error indicating insufficient permissions then adjust the configuration of the repository within your operating system to give Elasticsearch an appropriate level of access. To reproduce such problems directly, perform the same operations as Elasticsearch in the same security context as the one in which Elasticsearch is running. For example, on Linux, use a command such as su to switch to the user that Elasticsearch runs as.
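One way to reproduce a permissions problem outside Elasticsearch is a short probe script run as the same user as Elasticsearch. The sketch below is an assumption about what "similar operations" might look like, not the sequence Elasticsearch actually performs: it creates a directory, writes and renames a file, reads it back, lists the directory, and then cleans up. The function name probe_repository_path is hypothetical. It runs on a single node, so it does not test cross-node visibility.

```python
import os
import tempfile
import uuid


def probe_repository_path(repo_path):
    """Exercise basic file operations inside repo_path (illustration only).

    Raises OSError (for example PermissionError) if any step is denied.
    Removes everything it creates before returning.
    """
    probe_dir = os.path.join(repo_path, "probe-" + uuid.uuid4().hex)
    os.mkdir(probe_dir)                          # create a directory
    try:
        temp_name = os.path.join(probe_dir, "blob.tmp")
        final_name = os.path.join(probe_dir, "blob")
        with open(temp_name, "wb") as f:         # create and write a file
            f.write(b"probe")
            f.flush()
            os.fsync(f.fileno())
        os.rename(temp_name, final_name)         # rename the file
        with open(final_name, "rb") as f:        # open it and read it back
            data = f.read()
        listing = sorted(os.listdir(probe_dir))  # list the directory
        return data == b"probe" and listing == ["blob"]
    finally:
        for name in os.listdir(probe_dir):
            os.remove(os.path.join(probe_dir, name))
        os.rmdir(probe_dir)


# Demonstration against a temporary directory; in practice, pass the
# repository path (for example /mount/backups) instead.
with tempfile.TemporaryDirectory() as demo_path:
    print(probe_repository_path(demo_path))
```

A PermissionError raised at a particular step indicates which operation the operating system is denying to that user.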
If the verify repository or repository analysis APIs fail with an error indicating that operations on one node are not immediately visible on another node then adjust the configuration of the repository within your operating system to address this problem. If your repository cannot be configured with strong enough visibility guarantees then it is not suitable for use as an Elasticsearch snapshot repository.
The verify repository and repository analysis APIs will also fail if the operating system returns any other kind of I/O error when accessing the repository. If this happens, address the cause of the I/O error reported by the operating system.
Many NFS implementations match accounts across nodes using their numeric user IDs (UIDs) and group IDs (GIDs) rather than their names. It is possible for Elasticsearch to run under an account with the same name (often elasticsearch) on each node, but for these accounts to have different numeric user or group IDs. If your shared file system uses NFS then ensure that every node is running with the same numeric UID and GID, or else update your NFS configuration to account for the variance in numeric IDs across nodes.
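A simple way to spot such a mismatch is to collect the numeric IDs from each node and compare them. The sketch below assumes you gather the values yourself, for instance by running local_ids on every node. Both helper names, the node names, and the ID values are hypothetical and purely illustrative.

```python
def local_ids(username="elasticsearch"):
    """Return (uid, gid) for a local account. Unix-only: uses the pwd module."""
    import pwd  # imported lazily so find_id_mismatches works on any platform

    entry = pwd.getpwnam(username)
    return entry.pw_uid, entry.pw_gid


def find_id_mismatches(ids_by_node):
    """Return the nodes whose (uid, gid) pair differs from the first node's."""
    items = list(ids_by_node.items())
    if not items:
        return {}
    reference = items[0][1]
    return {node: ids for node, ids in items[1:] if ids != reference}


# Illustrative values only; collect real ones by running local_ids() on each node.
collected = {
    "node-1": (1000, 1000),
    "node-2": (1000, 1000),
    "node-3": (1001, 1001),
}
print(find_id_mismatches(collected))
```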
The linearizable register implementation for shared file system repositories is based on file locking. To perform a compare-and-exchange operation on a register, Elasticsearch first locks the underlying file and then writes the updated contents under the same lock. This ensures that the file has not changed in the meantime.
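The general lock-then-write pattern can be sketched as follows. This is an illustration of the technique only, not Elasticsearch's implementation: it uses Python's fcntl.flock advisory lock (Unix-only), and the function name compare_and_exchange and the convention that an empty file is the initial value are assumptions made for the example.

```python
import fcntl
import os
import tempfile


def compare_and_exchange(path, expected, updated):
    """Write `updated` to `path` only if the file currently holds `expected`.

    Returns the contents observed while holding the lock, so the caller
    can tell whether the exchange happened (returned value == expected).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # block until the exclusive lock is held
        try:
            current = f.read()             # read the register under the lock
            if current == expected:
                f.seek(0)
                f.truncate()               # replace the contents under the same lock
                f.write(updated)
                f.flush()
                os.fsync(f.fileno())
            return current
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


with tempfile.TemporaryDirectory() as demo_dir:
    register = os.path.join(demo_dir, "register")
    print(compare_and_exchange(register, b"", b"1"))   # succeeds: file starts empty
    print(compare_and_exchange(register, b"0", b"2"))  # fails: file holds b"1"
```

Because the read and the write happen under one lock, no other cooperating writer can change the file between the comparison and the update. Whether such locks are honoured across nodes depends on the shared file system.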