[[snapshots-register-repository]]
== Register a snapshot repository
++++
<titleabbrev>Register a repository</titleabbrev>
++++

This guide shows you how to register a snapshot repository. A snapshot
repository is an off-cluster storage location for your snapshots. You must
register a repository before you can take or restore snapshots.

In this guide, you’ll learn how to:

* Register a snapshot repository
* Verify that a repository is functional
* Clean up a repository to remove unneeded files

[discrete]
[[snapshot-repo-prereqs]]
=== Prerequisites

// tag::kib-snapshot-prereqs[]
* To use {kib}'s **Snapshot and Restore** feature, you must have the following
permissions:

** <<privileges-list-cluster,Cluster privileges>>: `monitor`, `manage_slm`,
`cluster:admin/snapshot`, and `cluster:admin/repository`

** <<privileges-list-indices,Index privilege>>: `all` on the `monitor` index
// end::kib-snapshot-prereqs[]

include::apis/put-repo-api.asciidoc[tag=put-repo-api-prereqs]

[discrete]
[[snapshot-repo-considerations]]
=== Considerations

When registering a snapshot repository, keep the following in mind:

* Each snapshot repository is separate and independent. {es} doesn't share
data between repositories.

* {blank}
+
--
// tag::multi-cluster-repo[]
If you register the same snapshot repository with multiple clusters, only one
cluster should have write access to the repository. On other clusters, register
the repository as read-only.

This prevents multiple clusters from writing to the repository at the same time
and corrupting the repository’s contents. It also prevents {es} from caching the
repository's contents, which means that changes made by other clusters will
become visible straight away.
// end::multi-cluster-repo[]
--

* Use a different snapshot repository for each major version of {es}. Mixing
snapshots from different major versions can corrupt a repository’s contents.
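
For example, a cluster that should not write to a shared repository might
register it with the `readonly` setting. This is a minimal sketch that assumes
a shared file system repository. The repository name and the `location` are
placeholders, and the location is assumed to be covered by `path.repo` already.

[source,console]
----
PUT _snapshot/my_shared_repository
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_fs_backup_location",
    "readonly": true
  }
}
----
// TEST[skip:illustrative example with a placeholder location]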

[discrete]
[[manage-snapshot-repos]]
=== Manage snapshot repositories

You can register and manage snapshot repositories in two ways:

* {kib}'s **Snapshot and Restore** feature
* {es}'s <<snapshot-restore-repo-apis,snapshot repository management APIs>>

To manage repositories in {kib}, go to the main menu and click **Stack
Management** > **Snapshot and Restore** > **Repositories**. To register a
snapshot repository, click **Register repository**.
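
As a rough illustration of the API route, the following request lists every
repository registered with the cluster. It assumes only that at least one
repository has been registered.

[source,console]
----
GET _snapshot/_all
----
// TEST[skip:illustrative example]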

[discrete]
[[snapshot-repo-types]]
=== Snapshot repository types

Supported snapshot repository types vary based on your deployment type.

[discrete]
[[ess-repo-types]]
==== {ess} repository types

{ess-trial}[{ess} deployments] automatically register the
{cloud}/ec-snapshot-restore.html[`found-snapshots`] repository. {ess} uses this
repository and the `cloud-snapshot-policy` to take periodic snapshots of your
cluster. You can also use the `found-snapshots` repository for your own
<<automate-snapshots-slm,{slm-init} policies>> or to store searchable snapshots.
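
A custom {slm-init} policy that stores its snapshots in `found-snapshots` could
look like the following sketch. The policy name, schedule, snapshot name
pattern, and retention values are illustrative assumptions, not recommended
settings.

[source,console]
----
PUT _slm/policy/nightly-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<nightly-snap-{now/d}>",
  "repository": "found-snapshots",
  "config": {
    "indices": "*",
    "include_global_state": true
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}
----
// TEST[skip:illustrative example for ESS deployments]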

The `found-snapshots` repository is specific to each deployment. However, you
can restore snapshots from another deployment's `found-snapshots` repository if
the deployments are under the same account and in the same region. See
{cloud}/ec_share_a_repository_across_clusters.html[Share a repository across
clusters].

{ess} deployments also support the following repository types:

* {cloud}/ec-aws-custom-repository.html[AWS S3]
* {cloud}/ec-gcs-snapshotting.html[Google Cloud Storage (GCS)]
* {cloud}/ec-azure-snapshotting.html[Microsoft Azure]
* <<snapshots-source-only-repository>>

[discrete]
[[self-managed-repo-types]]
==== Self-managed repository types

If you run {es} on your own hardware, you can use the following built-in
snapshot repository types:

* <<repository-s3,AWS S3>>
* <<repository-gcs,Google Cloud Storage>>
* <<repository-azure,Azure>>
* <<snapshots-filesystem-repository,Shared file system>>
* <<snapshots-read-only-repository>>
* <<snapshots-source-only-repository>>

[[snapshots-repository-plugins]]
Other repository types are available through official plugins:

* {plugins}/repository-hdfs.html[Hadoop Distributed File System (HDFS)]

You can also use alternative implementations of these repository types, such as
MinIO, as long as they're compatible. To verify a repository's compatibility,
see <<snapshots-repository-verification>>.

[discrete]
[[snapshots-filesystem-repository]]
==== Shared file system repository

// tag::on-prem-repo-type[]
NOTE: This repository type is only available if you run {es} on your own
hardware. If you use {ess}, see <<ess-repo-types>>.
// end::on-prem-repo-type[]

Use a shared file system repository to store snapshots on a
shared file system.

To register a shared file system repository, first mount the file system to the
same location on all master and data nodes. Then add the file system's
path or parent directory to the `path.repo` setting in `elasticsearch.yml` for
each master and data node. For running clusters, this requires a
<<restart-cluster-rolling,rolling restart>> of each node.

IMPORTANT: By default, a network file system (NFS) uses user IDs (UIDs) and
group IDs (GIDs) to match accounts across nodes. If your shared file system is
an NFS and your nodes don't use the same UIDs and GIDs, update your NFS
configuration to account for this.

Supported `path.repo` values vary by platform:

include::{es-repo-dir}/tab-widgets/register-fs-repo-widget.asciidoc[]
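
Once `path.repo` is set on every master and data node, registering the
repository might look like the following sketch. The repository name and the
`location` path are placeholders. The location is assumed to be one of the
paths listed in `path.repo`, or a subdirectory of one.

[source,console]
----
PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_fs_backup_location"
  }
}
----
// TEST[skip:illustrative example with a placeholder location]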

[discrete]
[[snapshots-read-only-repository]]
==== Read-only URL repository

include::register-repository.asciidoc[tag=on-prem-repo-type]

You can use a URL repository to give a cluster read-only access to a shared file
system. Since URL repositories are always read-only, they're a safer and more
convenient alternative to registering a read-only shared file system repository.

Use {kib} or the <<put-snapshot-repo-api,create snapshot repository API>> to
register a URL repository.

[source,console]
----
PUT _snapshot/my_read_only_url_repository
{
  "type": "url",
  "settings": {
    "url": "file:/mount/backups/my_fs_backup_location"
  }
}
----
// TEST[skip:no access to url file path]

[discrete]
[[snapshots-source-only-repository]]
==== Source-only repository

You can use a source-only repository to take minimal, source-only snapshots that
use up to 50% less disk space than regular snapshots.

Unlike other repository types, a source-only repository doesn't directly store
snapshots. It delegates storage to another registered snapshot repository.

When you take a snapshot using a source-only repository, {es} creates a
source-only snapshot in the delegated storage repository. This snapshot only
contains stored fields and metadata. It doesn't include index or doc values
structures and isn't immediately searchable when restored. To search the
restored data, you first have to <<docs-reindex,reindex>> it into a new data
stream or index.

[IMPORTANT]
==================================================
Source-only snapshots are only supported if the `_source` field is enabled and
no source-filtering is applied. When you restore a source-only snapshot:

* The restored index is read-only and can only serve `match_all` search or
scroll requests to enable reindexing.

* Queries other than `match_all` and `_get` requests are not supported.

* The mapping of the restored index is empty, but the original mapping is
available from the type's top-level `meta` element.
==================================================

Before registering a source-only repository, use {kib} or the
<<put-snapshot-repo-api,create snapshot repository API>> to register a snapshot
repository of another type to use for storage. Then register the source-only
repository and specify the delegated storage repository in the request.

[source,console]
----
PUT _snapshot/my_src_only_repository
{
  "type": "source",
  "settings": {
    "delegate_type": "fs",
    "location": "my_backup_location"
  }
}
----
// TEST[continued]

[discrete]
[[snapshots-repository-verification]]
=== Verify a repository

When you register a snapshot repository, {es} automatically verifies that the
repository is available and functional on all master and data nodes.

To disable this verification, set the <<put-snapshot-repo-api,create snapshot
repository API>>'s `verify` query parameter to `false`. You can't disable
repository verification in {kib}.

[source,console]
----
PUT _snapshot/my_unverified_backup?verify=false
{
  "type": "fs",
  "settings": {
    "location": "my_unverified_backup_location"
  }
}
----
// TEST[continued]

If needed, you can run the repository verification check manually. To verify a
repository in {kib}, go to the **Repositories** list page and click the name of
a repository. Then click **Verify repository**. You can also use the
<<verify-snapshot-repo-api,verify snapshot repository API>>.

[source,console]
----
POST _snapshot/my_unverified_backup/_verify
----
// TEST[continued]

If successful, the request returns a list of nodes used to verify the
repository. If verification fails, the request returns an error.
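
A successful response might be shaped like the following sketch. The node ID
and node name are placeholders, not output from a real cluster.

[source,console-result]
----
{
  "nodes": {
    "<node_id>": {
      "name": "<node_name>"
    }
  }
}
----
// TESTRESPONSE[skip:placeholder node ID and name]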

You can test a repository more thoroughly using the
<<repo-analysis-api,repository analysis API>>.
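
As a sketch, the following request runs a small analysis against the
`my_unverified_backup` repository registered earlier. The `blob_count`,
`max_blob_size`, and `timeout` values are illustrative only. A more thorough
analysis typically uses larger values and puts noticeable load on the
repository.

[source,console]
----
POST _snapshot/my_unverified_backup/_analyze?blob_count=10&max_blob_size=1mb&timeout=120s
----
// TEST[skip:illustrative example]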

[discrete]
[[snapshots-repository-cleanup]]
=== Clean up a repository

Over time, a repository can accumulate data that isn't referenced by any
existing snapshot. This is a result of the data safety guarantees that the
snapshot functionality provides in failure scenarios during snapshot creation,
and of the decentralized nature of the snapshot creation process. This
unreferenced data doesn't negatively impact the performance or safety of a
snapshot repository, but it does lead to higher than necessary storage use. To
remove this unreferenced data, you can run a cleanup operation on the
repository. The operation triggers a complete accounting of the repository's
contents and deletes any unreferenced data.

To run the repository cleanup operation in {kib}, go to the **Repositories**
list page and click the name of a repository. Then click **Clean up
repository**.

You can also use the <<clean-up-snapshot-repo-api,clean up snapshot repository
API>>.

[source,console]
----
POST _snapshot/my_repository/_cleanup
----
// TEST[continued]

The API returns:

[source,console-result]
----
{
  "results": {
    "deleted_bytes": 20,
    "deleted_blobs": 5
  }
}
----

Depending on the repository implementation, the number of bytes freed and the
number of blobs removed are either exact or approximate. Any non-zero value for
the number of blobs removed means that unreferenced blobs were found and
cleaned up.

Most of the cleanup operations performed by this API also run automatically
when you delete a snapshot from a repository. If you regularly delete
snapshots, a cleanup usually saves little or no space, so you can run it less
often.

[discrete]
[[snapshots-repository-backup]]
=== Back up a repository

You may wish to make an independent backup of your repository, for instance so
that you have an archive copy of its contents that you can use to recreate the
repository in its current state at a later date.

You must ensure that {es} does not write to the repository while you are taking
the backup of its contents. You can do this by unregistering it, or registering
it with `readonly: true`, on all your clusters. If {es} writes any data to the
repository during the backup, the contents of the backup may not be
consistent and it may not be possible to recover any data from it in the future.
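
For instance, one way to stop a cluster from writing to the repository is to
unregister it before the backup starts. This is a sketch only, and
`my_repository` is a placeholder name. Unregistering removes the cluster's
reference to the repository and leaves the stored snapshots in place.

[source,console]
----
DELETE _snapshot/my_repository
----
// TEST[skip:illustrative example]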

Alternatively, if your repository supports it, you may take an atomic snapshot
of the underlying filesystem and then take a backup of this filesystem
snapshot. It is very important that the filesystem snapshot is taken
atomically.

WARNING: You cannot use filesystem snapshots of individual nodes as a backup
mechanism. You must use the {es} snapshot and restore feature to copy the
cluster contents to a separate repository. Then, if desired, you can take a
filesystem snapshot of this repository.

When restoring a repository from a backup, you must not register the repository
with {es} until the repository contents are fully restored. If you alter the
contents of a repository while it is registered with {es} then the repository
may become unreadable or may silently lose some of its contents.

include::repository-s3.asciidoc[]
include::repository-gcs.asciidoc[]
include::repository-azure.asciidoc[]