Manage snapshot repositories in Elastic Cloud Hosted
Snapshot repositories allow you to back up and restore your Elasticsearch data efficiently. In Elastic Cloud Hosted, repositories are automatically registered and managed within your deployment, ensuring data security, long-term archiving, and seamless recovery.
By default, Elastic Cloud Hosted takes a snapshot of all the indices in your Elasticsearch cluster every 30 minutes. You can set a different snapshot interval if needed for your environment. You can also take snapshots on demand, without having to wait for the next interval. Taking a snapshot on demand does not affect the retention schedule for existing snapshots; it just adds an additional snapshot to the repository. This might be helpful if you are about to make a deployment change and you don’t have a current snapshot.
Use Kibana to manage your snapshots. In Kibana, you can set up additional repositories where the snapshots are stored, other than the one currently managed by Elastic Cloud Hosted. You can view and delete snapshots, and configure a snapshot lifecycle management (SLM) policy to automate when snapshots are created and deleted.
Snapshots back up only open indices. If you close an index, it is not included in snapshots and you will not be able to restore the data.
A snapshot taken using the default found-snapshots repository can only be restored to deployments in the same region. If you need to restore snapshots across regions, create the destination deployment, connect it to the custom repository that holds the snapshots, and then restore from a snapshot.
From within Elastic Cloud Hosted, you can restore a snapshot from a different deployment in the same region.
To use Kibana's Snapshot and Restore feature, you must have the following permissions:
- Cluster privileges: monitor, manage_slm, cluster:admin/snapshot, and cluster:admin/repository
- Index privilege: all on the monitor index
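As a rough illustration, the privileges listed above could be collected into a role body of the kind accepted by the Elasticsearch create role API (PUT _security/role/<name>). The role name snapshot_admin and the helper function are hypothetical; only the privilege names and the monitor index come from the list above.

```python
import json


def snapshot_role_body():
    """Build a role body granting the privileges listed above."""
    return {
        "cluster": [
            "monitor",
            "manage_slm",
            "cluster:admin/snapshot",
            "cluster:admin/repository",
        ],
        "indices": [
            # Index privilege "all" on the "monitor" index, as stated above.
            {"names": ["monitor"], "privileges": ["all"]},
        ],
    }


if __name__ == "__main__":
    # Hypothetical usage: send this body with PUT _security/role/snapshot_admin
    print(json.dumps(snapshot_role_body(), indent=2))
```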
To register a snapshot repository, the cluster’s global metadata must be writable. Ensure there aren’t any cluster blocks that prevent write access.
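One way to look for such blocks is to inspect the output of GET _cluster/settings?flat_settings=true for the cluster.blocks.read_only and cluster.blocks.read_only_allow_delete settings. The sketch below works on an already-fetched response dictionary. The function name and the sample response are illustrative assumptions, and the check covers only these two settings.

```python
BLOCK_SETTINGS = (
    "cluster.blocks.read_only",
    "cluster.blocks.read_only_allow_delete",
)


def active_cluster_blocks(settings_response):
    """Return the enabled read-only block settings, prefixed by their scope.

    `settings_response` is the parsed JSON returned by
    GET _cluster/settings?flat_settings=true.
    """
    active = []
    for scope in ("persistent", "transient"):
        scope_settings = settings_response.get(scope, {})
        for name in BLOCK_SETTINGS:
            # Values may be booleans or the strings "true" / "false".
            if str(scope_settings.get(name, "false")).lower() == "true":
                active.append(f"{scope}:{name}")
    return active


# Sample response, made up for illustration.
sample = {"persistent": {"cluster.blocks.read_only": "true"}, "transient": {}}
print(active_cluster_blocks(sample))
```

An empty list means neither setting is enabled; other kinds of blocks would need a separate check.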
When working with snapshot repositories in Elastic Cloud Hosted, keep the following in mind:
- Each snapshot repository is separate and independent. Elasticsearch doesn’t share data between repositories.
- Clusters should only register a particular snapshot repository bucket once. If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. On other clusters, register the repository as read-only.
- Registering the repository as read-only on the other clusters prevents multiple clusters from writing to the repository at the same time and corrupting its contents. It also prevents Elasticsearch from caching the repository’s contents, so changes made by other clusters become visible straight away.
- When upgrading Elasticsearch to a newer version, you can continue to use the same repository you were using before the upgrade. If the repository is accessed by multiple clusters, they should all have the same version. Once a repository has been modified by a particular version of Elasticsearch, it may not work correctly when accessed by older versions. However, you will be able to recover from a failed upgrade by restoring a snapshot taken before the upgrade into a cluster running the pre-upgrade version, even if you have taken more snapshots during or after the upgrade.
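The single-writer guidance in the list above can be sketched as follows: every cluster registers the same repository settings, and the readonly repository setting is true everywhere except on the one cluster allowed to write. The repository type s3, the bucket name, and the cluster names are placeholders, not values from this guide.

```python
def repository_bodies(repo_type, settings, writer, clusters):
    """Map each cluster name to the body for PUT _snapshot/<repository>."""
    if writer not in clusters:
        raise ValueError("the writer must be one of the clusters")
    bodies = {}
    for cluster in clusters:
        cluster_settings = dict(settings)
        # Only the designated writer keeps write access.
        cluster_settings["readonly"] = cluster != writer
        bodies[cluster] = {"type": repo_type, "settings": cluster_settings}
    return bodies


# Placeholder values for illustration only.
bodies = repository_bodies(
    "s3",
    {"bucket": "my-shared-bucket"},
    writer="prod",
    clusters=["prod", "staging"],
)
```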
Elastic Cloud Hosted deployments automatically register the found-snapshots repository. Elastic Cloud Hosted uses this repository and the cloud-snapshot-policy to take periodic snapshots of your cluster. You can also use the found-snapshots repository for your own SLM policies or to store searchable snapshots.
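For example, a custom SLM policy that stores its snapshots in found-snapshots might use a body like the one built below, which would be sent with PUT _slm/policy/<policy-id>. The schedule, snapshot name pattern, index selection, and retention values are illustrative assumptions; adjust them to your own requirements.

```python
import json


def nightly_policy_body(repository="found-snapshots"):
    """Build an example SLM policy body (all values are illustrative)."""
    return {
        "schedule": "0 30 1 * * ?",        # cron: 01:30 every day
        "name": "<nightly-snap-{now/d}>",  # date math in the snapshot name
        "repository": repository,
        "config": {
            "indices": ["*"],
            "include_global_state": True,
        },
        "retention": {
            "expire_after": "30d",
            "min_count": 5,
            "max_count": 50,
        },
    }


if __name__ == "__main__":
    print(json.dumps(nightly_policy_body(), indent=2))
```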
The found-snapshots repository is specific to each deployment. However, you can restore snapshots from another deployment’s found-snapshots repository if the deployments are under the same account and in the same region. See the Cloud Snapshot and restore documentation to learn more.
Elastic Cloud Hosted deployments also support the following repository types:
In Elastic Cloud Hosted, snapshot repositories are automatically registered for you, but you can create additional repositories if needed. You can register and manage repositories using either of the following:
- Kibana's Snapshot and Restore feature
- Elasticsearch's snapshot repository management APIs
To manage repositories in Kibana, go to the main menu and click Stack Management > Snapshot and Restore > Repositories. To register a snapshot repository, click Register repository.
You can also register a repository using the Create snapshot repository API.
When you register a snapshot repository, Elasticsearch automatically verifies that the repository is available and functional on all master and data nodes.
To disable this verification during repository creation, set the create snapshot repository API's verify query parameter to false. You can’t disable repository verification in Kibana.
PUT _snapshot/my_unverified_backup?verify=false
{
  "type": "fs",
  "settings": {
    "location": "my_unverified_backup_location"
  }
}
If needed, you can run the repository verification check manually. To verify a repository in Kibana, go to the Repositories list page and click the name of a repository. Then click Verify repository. You can also use the verify snapshot repository API.
POST _snapshot/my_unverified_backup/_verify
If successful, the request returns a list of nodes used to verify the repository. If verification fails, the request returns an error.
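A small sketch of reading a successful verification response, assuming it has the shape documented for the verify snapshot repository API: a nodes object keyed by node ID, where each entry carries a name. The node IDs and names in the sample are made up.

```python
def verified_node_names(verify_response):
    """Return the sorted names of the nodes listed in a verify response."""
    nodes = verify_response.get("nodes", {})
    return sorted(info["name"] for info in nodes.values())


# Made-up sample response for illustration.
sample_verify_response = {
    "nodes": {
        "node-id-b": {"name": "instance-0000000001"},
        "node-id-a": {"name": "instance-0000000000"},
    }
}
print(verified_node_names(sample_verify_response))
```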
You can test a repository more thoroughly using the repository analysis API.
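The helper below assembles a request path for the repository analysis API. The blob_count, max_blob_size, and timeout query parameters are part of that API, but the default values here are small illustrative choices rather than recommendations, and the function name is hypothetical.

```python
from urllib.parse import quote, urlencode


def analyze_path(repository, blob_count=10, max_blob_size="1mb", timeout="120s"):
    """Build the path and query string for POST _snapshot/<repository>/_analyze."""
    query = urlencode(
        {
            "blob_count": blob_count,
            "max_blob_size": max_blob_size,
            "timeout": timeout,
        }
    )
    # Percent-encode the repository name so it is safe inside a URL path.
    return f"_snapshot/{quote(repository, safe='')}/_analyze?{query}"


print(analyze_path("my_repository"))
```

The analysis writes and reads test data in the repository, so larger values put more load on the cluster and the storage service.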
Over time, repositories can accumulate data that is not referenced by any existing snapshot. This is a side effect of the data safety guarantees that the snapshot functionality provides when failures occur during snapshot creation, and of the decentralized nature of the snapshot creation process. Unreferenced data does not affect the performance or safety of a snapshot repository, but it uses more storage than necessary. To remove it, run a cleanup operation on the repository. The cleanup performs a complete accounting of the repository’s contents and deletes any unreferenced data.
To run the repository cleanup operation in Kibana, go to the Repositories list page and click the name of a repository. Then click Clean up repository.
You can also use the clean up snapshot repository API.
POST _snapshot/my_repository/_cleanup
The API returns:
{
  "results": {
    "deleted_bytes": 20,
    "deleted_blobs": 5
  }
}
Depending on the repository implementation, the number of bytes freed and the number of blobs removed are either exact or approximate. Any non-zero value for the number of blobs removed means that unreferenced blobs were found and cleaned up.
Most of the cleanup operations performed by this endpoint also run automatically whenever a snapshot is deleted from a repository. If you delete snapshots regularly, a manual cleanup will usually free little or no additional space, so you can run it less often.
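To decide whether manual cleanups are worth running, you could summarize each cleanup response. This is a minimal sketch that assumes the response shape shown above; the function name and message wording are hypothetical.

```python
def summarize_cleanup(response):
    """Turn a cleanup response into a one-line summary."""
    results = response["results"]
    blobs = results["deleted_blobs"]
    size = results["deleted_bytes"]
    if blobs == 0:
        return "No unreferenced blobs found."
    # Depending on the repository type, these counts may be approximate.
    return f"Removed {blobs} unreferenced blobs ({size} bytes)."


print(summarize_cleanup({"results": {"deleted_bytes": 20, "deleted_blobs": 5}}))
```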
You may wish to make an independent backup of your repository, for instance so that you have an archive copy of its contents that you can use to recreate the repository in its current state at a later date.
You must ensure that Elasticsearch does not write to the repository while you are taking the backup of its contents. If Elasticsearch writes any data to the repository during the backup then the contents of the backup may not be consistent and it may not be possible to recover any data from it in future. Prevent writes to the repository by unregistering the repository from the cluster which has write access to it.
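The unregister step can be sketched as a context manager that removes the repository registration (DELETE _snapshot/<repository>) before the copy and registers it again afterwards. Registering the repository again is an assumption about what you want after the backup, not something this guide prescribes. The send callable, the repository name, and the registration body are hypothetical placeholders.

```python
from contextlib import contextmanager


@contextmanager
def repository_unregistered(send, name, registration_body):
    """Keep `name` unregistered while the block runs, then register it again.

    `send(method, path, body)` is a caller-supplied function that performs
    the HTTP request against the cluster that has write access.
    """
    send("DELETE", f"_snapshot/{name}", None)
    try:
        yield
    finally:
        send("PUT", f"_snapshot/{name}", registration_body)


# Dry run that records the calls instead of contacting a cluster.
calls = []


def record(method, path, body):
    calls.append((method, path))


placeholder_body = {"type": "s3", "settings": {"bucket": "my-bucket"}}
with repository_unregistered(record, "my_repository", placeholder_body):
    # Placeholder for the real copy of the repository contents.
    calls.append(("COPY", "repository contents"))
```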
Alternatively, if your repository supports it, you may take an atomic snapshot of the underlying filesystem and then take a backup of this filesystem snapshot. It is very important that the filesystem snapshot is taken atomically.
Do not rely on repository backups that were taken by methods other than the one described in this section. If you use another method to make a copy of your repository contents then the resulting copy may capture an inconsistent view of your data. Restoring a repository from such a copy may fail, reporting errors, or may succeed having silently lost some of your data.
Do not use filesystem snapshots of individual nodes as a backup mechanism. You must use the Elasticsearch snapshot and restore feature to copy the cluster contents to a separate repository. Then, if desired, you can take a filesystem snapshot of this repository.
When restoring a repository from a backup, you must not register the repository with Elasticsearch until the repository contents are fully restored. If you alter the contents of a repository while it is registered with Elasticsearch then the repository may become unreadable or may silently lose some of its contents. After restoring a repository from a backup, use the Verify repository integrity API to verify its integrity before you start to use the repository.
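The ordering described above can be sketched as a small planning helper: it refuses to produce any requests until the restore is complete, then lists the registration followed by the integrity check. The _verify_integrity path is the one documented for recent Elasticsearch versions, so check it against the version you run. The function name and arguments are hypothetical.

```python
def post_restore_requests(name, registration_body, restore_complete):
    """Return the requests to send once a repository backup is fully restored."""
    if not restore_complete:
        # Registering early risks an unreadable or silently incomplete repository.
        raise RuntimeError("repository contents are not fully restored yet")
    return [
        ("PUT", f"_snapshot/{name}", registration_body),
        ("POST", f"_snapshot/{name}/_verify_integrity", None),
    ]


# Placeholder repository name and body for illustration.
plan = post_restore_requests(
    "my_repository",
    {"type": "s3", "settings": {"bucket": "my-bucket"}},
    restore_complete=True,
)
```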