S3 repository
You can use AWS S3 as a repository for Snapshot/Restore.
If you are looking for a hosted solution of Elasticsearch on AWS, visit https://www.elastic.co/cloud/.
See this video for a walkthrough of connecting an AWS S3 repository.
To register an S3 repository, specify the type as s3 when creating the repository. The only mandatory repository setting is the bucket name:
PUT _snapshot/my_s3_repository
{
"type": "s3",
"settings": {
"bucket": "my-bucket"
}
}
Client settings describe how repositories select an S3 client and how authentication is configured. The PUT request above specifies only the s3 type and the bucket, so it implicitly uses the default client. If no explicit credentials are configured, the default client attempts to resolve credentials automatically from the environment. Client configuration is covered in the following section.
Elasticsearch communicates with S3 through an S3 client. Clients are configured through a combination of secure settings in the Elasticsearch keystore, and standard settings in elasticsearch.yml. Settings use the prefix s3.client.CLIENT_NAME plus a suffix such as access_key. The full set of client settings is listed under S3 repository client settings.
You can define several clients, each with its own settings, using the form s3.client.CLIENT_NAME.SETTING_NAME. When the environment is compatible, Elasticsearch also creates a client named default, and s3 repositories use that client unless the repository setting client selects another name.
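For instance, you might declare an additional client in elasticsearch.yml and point a repository at it. The client name my_alternative_client and the values shown here are illustrative:

```yaml
# elasticsearch.yml — an additional S3 client (name and values are examples)
s3.client.my_alternative_client.region: us-west-2
s3.client.my_alternative_client.max_retries: 5
```

A repository then selects this client with the client repository setting, for example "client": "my_alternative_client".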
The default client can use automatic credential discovery when available:
- On an EC2 instance, the EC2 Instance Metadata Service (IMDS) can provide temporary credentials for the instance IAM role.
- In an Amazon ECS task, Elasticsearch can use temporary task IAM role credentials.
- On Kubernetes, you can use service account-based authentication, provided the pod is set up to expose the web identity token (see the instructions for Kubernetes service accounts below).
If you do not want to rely on automatic credential discovery, add explicit keys for a client in the Elasticsearch keystore. Typical secure settings for the built-in default client are:
- s3.client.default.access_key
- s3.client.default.secret_key
- s3.client.default.session_token (optional)
S3 client settings cover authentication, region and endpoint selection, proxy/network configuration, and connection or retry tuning. For a complete list of all S3 client settings, refer to S3 repository client settings.
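As a sketch, the secure settings listed above can be added with the elasticsearch-keystore tool on each node; the tool prompts for each value so that credentials never appear on the command line:

```sh
# Run on every node that can hold repository data
bin/elasticsearch-keystore add s3.client.default.access_key
bin/elasticsearch-keystore add s3.client.default.secret_key
# Only needed when using temporary credentials:
bin/elasticsearch-keystore add s3.client.default.session_token
```

These secure settings are reloadable, so after updating the keystore you can apply them by calling POST _nodes/reload_secure_settings instead of restarting each node.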
The s3 repository type supports a number of settings to customize how data is stored in S3. These can be specified when creating the repository. For example:
PUT _snapshot/my_s3_repository
{
"type": "s3",
"settings": {
"bucket": "my-bucket",
"client": "default"
}
}
client must refer to a configured S3 client. If omitted, the default client is used.
Available repository settings define storage placement, snapshot data handling, storage and encryption behavior, throughput limits, and multipart upload tuning. For a complete list of all S3 repository settings, refer to S3 repository settings.
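For example, the following request stores all repository data under a base path and enables server-side encryption; base_path and server_side_encryption are documented repository settings, and the path snapshots/production is illustrative:

```console
PUT _snapshot/my_s3_repository
{
  "type": "s3",
  "settings": {
    "bucket": "my-bucket",
    "base_path": "snapshots/production",
    "server_side_encryption": true
  }
}
```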
Amazon S3 supports a variety of storage classes, each of which offers different operational characteristics. For instance, some classes cost less per byte stored per month but cost more per request, while others differ in their availability guarantees.
You may specify the storage class that Elasticsearch uses to store data objects with the storage_class repository setting.
Changing the storage_class setting on an existing repository only affects the storage class for newly created objects, resulting in a mixed usage of storage classes.
You may use an S3 Lifecycle Policy to adjust the storage class of existing objects in your repository, but you must not transition objects to an unsupported class such as the Glacier classes, and you must not expire objects. If you use a Glacier storage class, or another unsupported storage class, or object expiry, then you may permanently lose access to your repository contents.
You may use the intelligent_tiering storage class to automatically manage the class of objects, but you must not enable the optional Archive Access or Deep Archive Access tiers. If you use these tiers then you may permanently lose access to your repository contents.
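For example, to use the intelligent_tiering class for newly written objects (without the archive tiers, per the warning above):

```console
PUT _snapshot/my_s3_repository
{
  "type": "s3",
  "settings": {
    "bucket": "my-bucket",
    "storage_class": "intelligent_tiering"
  }
}
```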
For more information about S3 storage classes, see AWS Storage Classes Guide.
To restrict the Elasticsearch snapshot process to the minimum required resources, we recommend using Amazon IAM in conjunction with pre-existing S3 buckets. Here is an example policy which allows snapshot access to an S3 bucket named "snaps.example.com". You can configure this through the AWS IAM console by creating a Custom Policy with a Policy Document similar to the following (change snaps.example.com to your bucket name):
{
"Statement": [
{
"Action": [
"s3:ListBucket",
"s3:GetBucketLocation",
"s3:ListBucketMultipartUploads",
"s3:ListBucketVersions"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::snaps.example.com"
]
},
{
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:AbortMultipartUpload",
"s3:ListMultipartUploadParts"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::snaps.example.com/*"
]
}
],
"Version": "2012-10-17"
}
You may further restrict the permissions by specifying a prefix within the bucket; in this example, the prefix is named "foo".
{
"Statement": [
{
"Action": [
"s3:ListBucket",
"s3:GetBucketLocation",
"s3:ListBucketMultipartUploads",
"s3:ListBucketVersions"
],
"Condition": {
"StringLike": {
"s3:prefix": [
"foo/*"
]
}
},
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::snaps.example.com"
]
},
{
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:AbortMultipartUpload",
"s3:ListMultipartUploadParts"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::snaps.example.com/foo/*"
]
}
],
"Version": "2012-10-17"
}
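When the IAM policy restricts access to a prefix as above, register the repository with a matching base_path so that Elasticsearch only reads and writes objects under that prefix:

```console
PUT _snapshot/my_s3_repository
{
  "type": "s3",
  "settings": {
    "bucket": "snaps.example.com",
    "base_path": "foo"
  }
}
```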
The bucket must already exist before you register a repository for snapshots; Elasticsearch does not create it, and registration fails if the bucket is missing.
If you want to use Kubernetes service accounts for authentication, add a symlink to the file referenced by the $AWS_WEB_IDENTITY_TOKEN_FILE environment variable (which Kubernetes sets automatically in the pod) in the S3 repository config directory. A repository cannot read any files outside its config directory, so the symlink is what gives it read access to the service account token. For example:
mkdir -p "${ES_PATH_CONF}/repository-s3"
ln -s $AWS_WEB_IDENTITY_TOKEN_FILE "${ES_PATH_CONF}/repository-s3/aws-web-identity-token-file"
The symlink must be created on all data and master-eligible nodes and must be readable by the elasticsearch user. By default, Elasticsearch runs as the user elasticsearch with uid:gid 1000:0.
If the symlink exists, it will be used by default by all S3 repositories that don't have explicit client credentials.
AWS instances resolve S3 endpoints to a public IP. If the Elasticsearch instances reside in a private subnet in an AWS VPC, then all traffic to S3 goes through the VPC's NAT instance. If your VPC's NAT instance is a smaller instance size (e.g. a t2.micro) or is handling a high volume of network traffic, your bandwidth to S3 may be limited by that NAT instance's networking bandwidth limitations. Instead, we recommend creating a VPC endpoint, which enables instances in a private subnet to connect to S3 directly. This eliminates any limitations imposed by the network bandwidth of your VPC's NAT instance.
Instances residing in a public subnet in an AWS VPC will connect to S3 via the VPC's internet gateway and not be bandwidth limited by the VPC's NAT instance.
AWS S3 supports replication of objects, both within a single region and across regions. However, this replication is not compatible with Elasticsearch snapshots.
The objects that Elasticsearch writes to the repository refer to other objects in the repository. Elasticsearch writes objects in a very specific order to ensure that each object only refers to objects which already exist. Likewise, Elasticsearch only deletes an object from the repository after it becomes unreferenced by all other objects. AWS S3 replication will apply operations to the replica repository in a different order from the order in which Elasticsearch applies them to the primary repository, which can cause some objects in replica repositories to refer to other objects that do not exist. This is an invalid state. It may not be possible to recover any data from a repository if it is in this state.
To replicate a repository's contents elsewhere, follow the repository backup process. In particular, you may use the point-in-time restore capability of AWS S3 backups to restore a backup of a snapshot repository to an earlier point in time.
There are a number of storage systems that provide an S3-compatible API, and the s3 repository type allows you to use these systems in place of AWS S3. To do so, you should set the s3.client.CLIENT_NAME.endpoint setting to the system's endpoint. This setting accepts IP addresses and hostnames and may include a port. For example, the endpoint may be 172.17.0.2 or 172.17.0.2:9000.
By default Elasticsearch communicates with your storage system using HTTPS, and validates the repository's certificate chain using the JVM-wide truststore. Ensure that the JVM-wide truststore includes an entry for your repository. If you wish to use unsecured HTTP communication instead of HTTPS, set s3.client.CLIENT_NAME.protocol to http.
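As a sketch, a client for an S3-compatible system reachable at 172.17.0.2:9000 over plain HTTP (the client name and address are illustrative) might be configured in elasticsearch.yml as:

```yaml
# elasticsearch.yml — client for an S3-compatible system (example values)
s3.client.my_compatible_client.endpoint: "172.17.0.2:9000"
s3.client.my_compatible_client.protocol: http
```

A repository then selects this client by setting "client": "my_compatible_client" in its settings.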
There are many systems, including some from very well-known storage vendors, which claim to offer an S3-compatible API despite failing to emulate S3's behavior in full. If you are using such a system for your snapshots, consider using a shared filesystem repository based on a standardized protocol such as NFS to access your storage system instead. The s3 repository type requires full compatibility with S3. In particular it must support the same set of API endpoints, with the same parameters, return the same errors in case of failures, and offer consistency, performance, and reliability at least as good as S3 even when accessed concurrently by multiple nodes. You will need to work with the supplier of your storage system to address any incompatibilities you encounter. Don't report Elasticsearch issues involving storage systems which claim to be S3-compatible unless you can demonstrate that the same issue exists when using a genuine AWS S3 repository.
You can perform some basic checks of the suitability of your storage system using the repository analysis API. If this API does not complete successfully, or indicates poor performance, then your storage system is not fully compatible with AWS S3 and therefore unsuitable for use as a snapshot repository. However, a successful response from this API does not guarantee full compatibility, so you must also ensure that your storage supplier offers a full compatibility guarantee. When upgrading, always verify that your storage passes repository analysis in the upgraded version before upgrading any production clusters.
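For example, a quick analysis of the repository registered above; the query parameters are optional, and the values shown are illustrative:

```console
POST /_snapshot/my_s3_repository/_analyze?blob_count=10&max_blob_size=1mb&timeout=120s
```

Larger values of blob_count and max_blob_size exercise the storage system more thoroughly at the cost of a longer-running analysis.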
MinIO is an example of a storage system that provides an S3-compatible API. The s3 repository type allows Elasticsearch to work with MinIO-backed repositories as well as repositories stored on AWS S3. The Elasticsearch test suite includes some checks which aim to detect deviations in behavior between MinIO and AWS S3. Elastic will report directly to the MinIO project any deviations in behavior found by these checks. If you are running a version of MinIO whose behavior deviates from that of AWS S3 then you must upgrade your MinIO installation. If in doubt, please contact the MinIO support team for further information.
The performance, reliability, and durability of a MinIO-backed repository depend on the properties of the underlying infrastructure and on the details of your MinIO configuration. You must design your storage infrastructure and configure MinIO in a way that ensures your MinIO-backed repository has performance, reliability, and durability characteristics which match AWS S3 in order for it to be fully S3-compatible. If you need assistance with your MinIO configuration, please contact the MinIO support team.
Most storage systems can be configured to log the details of their interaction with Elasticsearch. If you are investigating a suspected incompatibility with AWS S3, it is usually simplest to collect these logs from your storage system and provide them to the supplier of your storage system for further analysis. Contact the supplier of your storage system for advice on how to configure it to log requests sufficiently verbosely for this troubleshooting.
If the incompatibility is not clear from the logs emitted by the storage system, you can enable more granular logging:
In Elasticsearch versions 9.1.0 to 9.1.8, and 9.2.0 to 9.2.2, it is not possible to obtain more detailed logs from the AWS Java SDK. Use the logs from the storage system itself, or upgrade to a later version of Elasticsearch.
Configure Elasticsearch to log every request it makes to the S3 API by setting the logging level of the software.amazon.awssdk.request logger to DEBUG:
PUT /_cluster/settings
{
"persistent": {
"logger.software.amazon.awssdk.request": "DEBUG"
}
}
To prevent leaking sensitive information such as credentials and keys in logs, Elasticsearch rejects configuring this logger at high verbosity unless insecure network trace logging is enabled. To do so, you must explicitly enable it on each node by setting the system property es.insecure_network_trace_enabled to true.
Collect the Elasticsearch logs covering the time period of the failed analysis from all nodes in your cluster and share them with the supplier of your storage system along with the analysis response so they can use them to determine the problem. Refer to Logging with the AWS S3 SDK for Java 2.x for further information, including details about other loggers that can be used to obtain even more verbose logs. When configuring other loggers, note that Elasticsearch configures the AWS Java SDK to use the ApacheHttpClient synchronous HTTP client.
Configure Elasticsearch to log every request it makes to the S3 API by setting the logging level of the com.amazonaws.request logger to DEBUG:
PUT /_cluster/settings
{
"persistent": {
"logger.com.amazonaws.request": "DEBUG"
}
}
To prevent leaking sensitive information such as credentials and keys in logs, Elasticsearch rejects configuring this logger at high verbosity unless insecure network trace logging is enabled. To do so, you must explicitly enable it on each node by setting the system property es.insecure_network_trace_enabled to true.
Collect the Elasticsearch logs covering the time period of the failed analysis from all nodes in your cluster and share them with the supplier of your storage system along with the analysis response so they can use them to determine the problem. Refer to Logging AWS SDK for Java Calls for further information, including details about other loggers that can be used to obtain even more verbose logs.
When you have finished collecting the logs needed by your supplier, set the logger settings back to null to return to the default logging configuration and deactivate insecure network trace logging again. Refer to Logger and Cluster update settings for more information.
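For example, to restore the default verbosity for the SDK request loggers described above:

```console
PUT /_cluster/settings
{
  "persistent": {
    "logger.software.amazon.awssdk.request": null,
    "logger.com.amazonaws.request": null
  }
}
```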
From 9.3.0 onwards, the linearizable register implementation for S3 repositories is based on S3's conditional writes, using the If-None-Match and If-Match request headers.
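As an illustrative HTTP-level sketch of S3's conditional-write headers (not the exact requests Elasticsearch issues; the bucket, key, and ETag are examples): a PUT with If-None-Match: * succeeds only if the object does not yet exist, and a PUT with If-Match succeeds only if the object's current ETag matches.

```http
PUT /my-bucket/my-register HTTP/1.1
Host: s3.us-east-1.amazonaws.com
If-None-Match: *

PUT /my-bucket/my-register HTTP/1.1
Host: s3.us-east-1.amazonaws.com
If-Match: "expected-etag"
```

A failed precondition returns an error (412 Precondition Failed) rather than overwriting the object, which is what makes compare-and-exchange possible.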
If your storage does not support conditional writes then it is not fully S3-compatible. However, if this is its only deviation in behavior from AWS S3 then it will work correctly with Elasticsearch as long as its multipart upload APIs have strongly consistent semantics, as described below. Future versions of Elasticsearch may remove this lenient behavior and require your storage to support conditional writes. Contact the supplier of your storage for further information about conditional writes and the strong consistency of your storage's multipart upload APIs.
In versions before 9.3.0, or if your storage does not support conditional writes, the linearizable register implementation for S3 repositories is based on the strongly consistent semantics of the multipart upload APIs. Elasticsearch first creates a multipart upload to indicate its intention to perform a linearizable register operation. Elasticsearch then lists and cancels all other multipart uploads for the same register. Elasticsearch then attempts to complete the upload. If the upload completes successfully then the compare-and-exchange operation was atomic.