123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335 |
- [[snapshots-register-repository]]
- == Register a snapshot repository
- ++++
- <titleabbrev>Register a repository</titleabbrev>
- ++++
- This guide shows you how to register a snapshot repository. A snapshot
- repository is an off-cluster storage location for your snapshots. You must
- register a repository before you can take or restore snapshots.
- In this guide, you’ll learn how to:
- * Register a snapshot repository
- * Verify that a repository is functional
- * Clean up a repository to remove unneeded files
- [discrete]
- [[snapshot-repo-prereqs]]
- === Prerequisites
- // tag::kib-snapshot-prereqs[]
- * To use {kib}'s **Snapshot and Restore** feature, you must have the following
- permissions:
- ** <<privileges-list-cluster,Cluster privileges>>: `monitor`, `manage_slm`,
- `cluster:admin/snapshot`, and `cluster:admin/repository`
- ** <<privileges-list-indices,Index privilege>>: `all` on the `monitor` index
- // end::kib-snapshot-prereqs[]
- include::apis/put-repo-api.asciidoc[tag=put-repo-api-prereqs]
- [discrete]
- [[snapshot-repo-considerations]]
- === Considerations
- When registering a snapshot repository, keep the following in mind:
- * Each snapshot repository is separate and independent. {es} doesn't share
- data between repositories.
- * {blank}
- +
- --
- // tag::multi-cluster-repo[]
- If you register the same snapshot repository with multiple clusters, only one
- cluster should have write access to the repository. On other clusters, register
- the repository as read-only.
- This prevents multiple clusters from writing to the repository at the same time
- and corrupting the repository’s contents.
- // end::multi-cluster-repo[]
- --
- * Use a different snapshot repository for each major version of {es}. Mixing
- snapshots from different major versions can corrupt a repository’s contents.
- [discrete]
- [[manage-snapshot-repos]]
- === Manage snapshot repositories
- You can register and manage snapshot repositories in two ways:
- * {kib}'s **Snapshot and Restore** feature
- * {es}'s <<snapshot-restore-repo-apis,snapshot repository management APIs>>
- To manage repositories in {kib}, go to the main menu and click **Stack
- Management** > **Snapshot and Restore** > **Repositories**. To register a
- snapshot repository, click **Register repository**.
- [discrete]
- [[snapshot-repo-types]]
- === Snapshot repository types
- Supported snapshot repository types vary based on your deployment type.
- [discrete]
- [[ess-repo-types]]
- ==== {ess} repository types
- {ess-trial}[{ess} deployments] automatically register the
- {cloud}/ec-snapshot-restore.html[`found-snapshots`] repository. {ess} uses this
- repository and the `cloud-snapshot-policy` to take periodic snapshots of your
- cluster. You can also use the `found-snapshots` repository for your own
- <<automate-snapshots-slm,{slm-init} policies>> or to store searchable snapshots.
- The `found-snapshots` repository is specific to each deployment. However, you
- can restore snapshots from another deployment's `found-snapshots` repository if
- the deployments are under the same account and in the same region. See
- {cloud}/ec_share_a_repository_across_clusters.html[Share a repository across
- clusters].
- {ess} deployments also support the following repository types:
- * {cloud}/ec-aws-custom-repository.html[AWS S3]
- * {cloud}/ec-gcs-snapshotting.html[Google Cloud Storage (GCS)]
- * {cloud}/ec-azure-snapshotting.html[Microsoft Azure]
- * <<snapshots-source-only-repository>>
- [discrete]
- [[self-managed-repo-types]]
- ==== Self-managed repository types
- If you run the {es} on your own hardware, you can use the following built-in
- snapshot repository types:
- * <<snapshots-filesystem-repository,Shared file system>>
- * <<snapshots-read-only-repository>>
- * <<snapshots-source-only-repository>>
- [[snapshots-repository-plugins]]
- Other repository types are available through official plugins:
- * {plugins}/repository-s3.html[AWS S3]
- * {plugins}/repository-gcs.html[Google Cloud Storage (GCS)]
- * {plugins}/repository-hdfs.html[Hadoop Distributed File System (HDFS)]
- * {plugins}/repository-azure.html[Microsoft Azure]
- You can also use alternative implementations of these repository types, such as
- MinIO, as long as they're compatible. To verify a repository's compatibility,
- see <<snapshots-repository-verification>>.
- [discrete]
- [[snapshots-filesystem-repository]]
- ==== Shared file system repository
- // tag::on-prem-repo-type[]
- NOTE: This repository type is only available if you run {es} on your own
- hardware. If you use {ess}, see <<ess-repo-types>>.
- // end::on-prem-repo-type[]
- Use a shared file system repository to store snapshots on a
- shared file system.
- To register a shared file system repository, first mount the file system to the
- same location on all master and data nodes. Then add the file system's
- path or parent directory to the `path.repo` setting in `elasticsearch.yml` for
- each master and data node. For running clusters, this requires a
- <<restart-cluster-rolling,rolling restart>> of each node.
- IMPORTANT: By default, a network file system (NFS) uses user IDs (UIDs) and
- group IDs (GIDs) to match accounts across nodes. If your shared file system is
- an NFS and your nodes don't use the same UIDs and GIDs, update your NFS
- configuration to account for this.
- Supported `path.repo` values vary by platform:
- include::{es-repo-dir}/tab-widgets/register-fs-repo-widget.asciidoc[]
- [discrete]
- [[snapshots-read-only-repository]]
- ==== Read-only URL repository
- include::register-repository.asciidoc[tag=on-prem-repo-type]
- You can use a URL repository to give a cluster read-only access to a shared file
- system. Since URL repositories are always read-only, they're a safer and more
- convenient alternative to registering a read-only shared filesystem repository.
- Use {kib} or the <<put-snapshot-repo-api,create snapshot repository API>> to
- register a URL repository.
- [source,console]
- ----
- PUT _snapshot/my_read_only_url_repository
- {
- "type": "url",
- "settings": {
- "url": "file:/mount/backups/my_fs_backup_location"
- }
- }
- ----
- // TEST[skip:no access to url file path]
- [discrete]
- [[snapshots-source-only-repository]]
- ==== Source-only repository
- You can use a source-only repository to take minimal, source-only snapshots that
- use up to 50% less disk space than regular snapshots.
- Unlike other repository types, a source-only repository doesn't directly store
- snapshots. It delegates storage to another registered snapshot repository.
- When you take a snapshot using a source-only repository, {es} creates a
- source-only snapshot in the delegated storage repository. This snapshot only
- contains stored fields and metadata. It doesn't include index or doc values
- structures and isn't immediately searchable when restored. To search the
- restored data, you first have to <<docs-reindex,reindex>> it into a new data
- stream or index.
- [IMPORTANT]
- ==================================================
- Source-only snapshots are only supported if the `_source` field is enabled and no source-filtering is applied.
- When you restore a source-only snapshot:
- * The restored index is read-only and can only serve `match_all` search or scroll requests to enable reindexing.
- * Queries other than `match_all` and `_get` requests are not supported.
- * The mapping of the restored index is empty, but the original mapping is available from the types top
- level `meta` element.
- ==================================================
- Before registering a source-only repository, use {kib} or the
- <<put-snapshot-repo-api,create snapshot repository API>> to register a snapshot
- repository of another type to use for storage. Then register the source-only
- repository and specify the delegated storage repository in the request.
- [source,console]
- ----
- PUT _snapshot/my_src_only_repository
- {
- "type": "source",
- "settings": {
- "delegate_type": "fs",
- "location": "my_backup_location"
- }
- }
- ----
- // TEST[continued]
- [discrete]
- [[snapshots-repository-verification]]
- === Verify a repository
- When you register a snapshot repository, {es} automatically verifies that the
- repository is available and functional on all master and data nodes.
- To disable this verification, set the <<put-snapshot-repo-api,create snapshot
- repository API>>'s `verify` query parameter to `false`. You can't disable
- repository verification in {kib}.
- [source,console]
- ----
- PUT _snapshot/my_unverified_backup?verify=false
- {
- "type": "fs",
- "settings": {
- "location": "my_unverified_backup_location"
- }
- }
- ----
- // TEST[continued]
- If wanted, you can manually run the repository verification check. To verify a
- repository in {kib}, go to the **Repositories** list page and click the name of
- a repository. Then click **Verify repository**. You can also use the
- <<verify-snapshot-repo-api,verify snapshot repository API>>.
- [source,console]
- ----
- POST _snapshot/my_unverified_backup/_verify
- ----
- // TEST[continued]
- If successful, the request returns a list of nodes used to verify the
- repository. If verification fails, the request returns an error.
- You can test a repository more thoroughly using the
- <<repo-analysis-api,repository analysis API>>.
- [discrete]
- [[snapshots-repository-cleanup]]
- === Clean up a repository
- Repositories can over time accumulate data that is not referenced by any existing snapshot. This is a result of the data safety guarantees
- the snapshot functionality provides in failure scenarios during snapshot creation and the decentralized nature of the snapshot creation
- process. This unreferenced data does in no way negatively impact the performance or safety of a snapshot repository but leads to higher
- than necessary storage use. To remove this unreferenced data, you can run a cleanup operation on the repository. This will
- trigger a complete accounting of the repository's contents and delete any unreferenced data.
- To run the repository cleanup operation in {kib}, go to the **Repositories**
- list page and click the name of a repository. Then click **Clean up
- repository**.
- You can also use the <<clean-up-snapshot-repo-api,clean up snapshot repository
- API>>.
- [source,console]
- ----
- POST _snapshot/my_repository/_cleanup
- ----
- // TEST[continued]
- The API returns:
- [source,console-result]
- ----
- {
- "results": {
- "deleted_bytes": 20,
- "deleted_blobs": 5
- }
- }
- ----
- Depending on the concrete repository implementation the numbers shown for bytes free as well as the number of blobs removed will either
- be an approximation or an exact result. Any non-zero value for the number of blobs removed implies that unreferenced blobs were found and
- subsequently cleaned up.
- Please note that most of the cleanup operations executed by this endpoint are automatically executed when deleting any snapshot from a
- repository. If you regularly delete snapshots, you will in most cases not get any or only minor space savings from using this functionality
- and should lower your frequency of invoking it accordingly.
- [discrete]
- [[snapshots-repository-backup]]
- === Back up a repository
- You may wish to make an independent backup of your repository, for instance so
- that you have an archive copy of its contents that you can use to recreate the
- repository in its current state at a later date.
- You must ensure that {es} does not write to the repository while you are taking
- the backup of its contents. You can do this by unregistering it, or registering
- it with `readonly: true`, on all your clusters. If {es} writes any data to the
- repository during the backup then the contents of the backup may not be
- consistent and it may not be possible to recover any data from it in future.
- Alternatively, if your repository supports it, you may take an atomic snapshot
- of the underlying filesystem and then take a backup of this filesystem
- snapshot. It is very important that the filesystem snapshot is taken
- atomically.
- WARNING: You cannot use filesystem snapshots of individual nodes as a backup
- mechanism. You must use the {es} snapshot and restore feature to copy the
- cluster contents to a separate repository. Then, if desired, you can take a
- filesystem snapshot of this repository.
- When restoring a repository from a backup, you must not register the repository
- with {es} until the repository contents are fully restored. If you alter the
- contents of a repository while it is registered with {es} then the repository
- may become unreadable or may silently lose some of its contents.
|