[[snapshots-register-repository]]
== Register a snapshot repository
++++
<titleabbrev>Register a repository</titleabbrev>
++++

This guide shows you how to register a snapshot repository. A snapshot
repository is an off-cluster storage location for your snapshots. You must
register a repository before you can take or restore snapshots.

In this guide, you’ll learn how to:

* Register a snapshot repository
* Verify that a repository is functional
* Clean up a repository to remove unneeded files

[discrete]
[[snapshot-repo-prereqs]]
=== Prerequisites

// tag::kib-snapshot-prereqs[]
* To use {kib}'s **Snapshot and Restore** feature, you must have the following
permissions:

** <<privileges-list-cluster,Cluster privileges>>: `monitor`, `manage_slm`,
`cluster:admin/snapshot`, and `cluster:admin/repository`

** <<privileges-list-indices,Index privilege>>: `all` on the `monitor` index
// end::kib-snapshot-prereqs[]

include::apis/put-repo-api.asciidoc[tag=put-repo-api-prereqs]

[discrete]
[[snapshot-repo-considerations]]
=== Considerations

When registering a snapshot repository, keep the following in mind:

* Each snapshot repository is separate and independent. {es} doesn't share
data between repositories.

* {blank}
+
--
// tag::multi-cluster-repo[]
If you register the same snapshot repository with multiple clusters, only one
cluster should have write access to the repository. On other clusters, register
the repository as read-only.

This prevents multiple clusters from writing to the repository at the same time
and corrupting the repository’s contents. It also prevents {es} from caching the
repository's contents, which means that changes made by other clusters will
become visible straight away.
// end::multi-cluster-repo[]
--

* When upgrading {es} to a newer version, you can continue to use the same
repository you were using before the upgrade. If the repository is accessed by
multiple clusters, they should all run the same version. Once a repository has
been modified by a particular version of {es}, it may not work correctly when
accessed by older versions. However, you will be able to recover from a failed
upgrade by restoring a snapshot taken before the upgrade into a cluster running
the pre-upgrade version, even if you have taken more snapshots during or after
the upgrade.
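
As a sketch of the multi-cluster consideration above, an additional cluster
could register the same storage location with the `readonly` setting. The
repository name and location below are placeholders, and the example assumes a
shared file system (`fs`) repository whose location is already listed in the
`path.repo` setting.

[source,console]
----
PUT _snapshot/my_read_only_repository
{
  "type": "fs",
  "settings": {
    "location": "my_shared_backup_location",
    "readonly": true
  }
}
----
// TEST[skip:illustrative example with a placeholder location]

The cluster that owns the repository registers the same location without the
`readonly` setting.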

[discrete]
[[manage-snapshot-repos]]
=== Manage snapshot repositories

You can register and manage snapshot repositories in two ways:

* {kib}'s **Snapshot and Restore** feature
* {es}'s <<snapshot-restore-repo-apis,snapshot repository management APIs>>

To manage repositories in {kib}, go to the main menu and click **Stack
Management** > **Snapshot and Restore** > **Repositories**. To register a
snapshot repository, click **Register repository**.

You can also register a repository using the <<put-snapshot-repo-api,Create
snapshot repository API>>.
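
For instance, the following request is a minimal sketch that registers a shared
file system repository. The repository name `my_repository` and the location
are placeholders. An `fs` repository also requires the location to be listed in
the `path.repo` setting on every master and data node.

[source,console]
----
PUT _snapshot/my_repository
{
  "type": "fs",
  "settings": {
    "location": "my_backup_location"
  }
}
----
// TEST[skip:illustrative example with a placeholder location]

To confirm that the repository was registered, you can retrieve its definition:

[source,console]
----
GET _snapshot/my_repository
----
// TEST[skip:illustrative example]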

[discrete]
[[snapshot-repo-types]]
=== Snapshot repository types

Supported snapshot repository types vary based on your deployment type:

* <<ess-repo-types>>
* <<self-managed-repo-types>>

[discrete]
[[ess-repo-types]]
==== {ess} repository types

{ess-trial}[{ess} deployments] automatically register the
{cloud}/ec-snapshot-restore.html[`found-snapshots`] repository. {ess} uses this
repository and the `cloud-snapshot-policy` to take periodic snapshots of your
cluster. You can also use the `found-snapshots` repository for your own
<<automate-snapshots-slm,{slm-init} policies>> or to store searchable snapshots.

The `found-snapshots` repository is specific to each deployment. However, you
can restore snapshots from another deployment's `found-snapshots` repository if
the deployments are under the same account and in the same region. See
{cloud}/ec_share_a_repository_across_clusters.html[Share a repository across
clusters].

{ess} deployments also support the following repository types:

* {cloud}/ec-azure-snapshotting.html[Azure]
* {cloud}/ec-gcs-snapshotting.html[Google Cloud Storage]
* {cloud}/ec-aws-custom-repository.html[AWS S3]
* <<snapshots-source-only-repository,Source-only>>

[discrete]
[[self-managed-repo-types]]
==== Self-managed repository types

If you manage your own {es} cluster, you can use the following built-in
snapshot repository types:

* <<repository-azure,Azure>>
* <<repository-gcs,Google Cloud Storage>>
* <<repository-s3,AWS S3>>
* <<snapshots-filesystem-repository,Shared file system>>
* <<snapshots-read-only-repository,Read-only URL>>
* <<snapshots-source-only-repository,Source-only>>

[[snapshots-repository-plugins]]
Other repository types are available through official plugins:

* {plugins}/repository-hdfs.html[Hadoop Distributed File System (HDFS)]

You can also use alternative storage implementations with these repository
types, as long as the alternative implementation is fully compatible. For
instance, https://minio.io[MinIO] provides an alternative implementation of the
AWS S3 API and you can use MinIO with the <<repository-s3,`s3` repository
type>>.
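
As a rough sketch, registering an S3-compatible store uses the same request as
AWS S3. The repository and bucket names below are placeholders, and the example
assumes that the `default` S3 client has already been pointed at your storage
system through the `s3.client.default.endpoint` node setting.

[source,console]
----
PUT _snapshot/my_s3_compatible_repository
{
  "type": "s3",
  "settings": {
    "bucket": "my-bucket",
    "client": "default"
  }
}
----
// TEST[skip:illustrative example that needs an external S3-compatible service]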

Note that some storage systems claim to be compatible with these repository
types without emulating their behaviour in full. {es} requires full
compatibility. In particular, the alternative implementation must support the
same set of API endpoints, return the same errors in case of failures, and
offer equivalent consistency guarantees and performance even when accessed
concurrently by multiple nodes. Problems caused by incompatible error codes,
weaker consistency, or poor performance can be particularly hard to track down
because they are usually rare and hard to reproduce.

You can perform some basic checks of the suitability of your storage system
using the <<repo-analysis-api,repository analysis API>>. If this API does not
complete successfully, or indicates poor performance, then your storage system
is not fully compatible and is therefore unsuitable for use as a snapshot
repository. You will need to work with the supplier of your storage system to
address any incompatibilities you encounter.
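
For example, a small analysis run might look like the following. The repository
name is a placeholder, and the parameter values are deliberately small so that
the request works as a quick check rather than a thorough test.

[source,console]
----
POST _snapshot/my_repository/_analyze?blob_count=10&max_blob_size=1mb&timeout=120s
----
// TEST[skip:illustrative example]

Larger values for `blob_count` and `max_blob_size` exercise the storage system
more heavily, at the cost of a longer run.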

[discrete]
[[snapshots-repository-verification]]
=== Verify a repository

When you register a snapshot repository, {es} automatically verifies that the
repository is available and functional on all master and data nodes.

To disable this verification, set the <<put-snapshot-repo-api,create snapshot
repository API>>'s `verify` query parameter to `false`. You can't disable
repository verification in {kib}.

[source,console]
----
PUT _snapshot/my_unverified_backup?verify=false
{
  "type": "fs",
  "settings": {
    "location": "my_unverified_backup_location"
  }
}
----
// TEST[setup:setup-repository]
// TEST[s/my_unverified_backup_location/my_repository/]

If needed, you can manually run the repository verification check. To verify a
repository in {kib}, go to the **Repositories** list page and click the name of
a repository. Then click **Verify repository**. You can also use the
<<verify-snapshot-repo-api,verify snapshot repository API>>.

[source,console]
----
POST _snapshot/my_unverified_backup/_verify
----
// TEST[continued]
// TEST[s/my_unverified_backup_location/my_repository/]

If successful, the request returns a list of nodes used to verify the
repository. If verification fails, the request returns an error.

You can test a repository more thoroughly using the
<<repo-analysis-api,repository analysis API>>.

[discrete]
[[snapshots-repository-cleanup]]
=== Clean up a repository

Over time, repositories can accumulate data that is not referenced by any
existing snapshot. This is a result of the data safety guarantees that the
snapshot functionality provides in failure scenarios during snapshot creation,
and of the decentralized nature of the snapshot creation process. This
unreferenced data does not affect the performance or safety of a snapshot
repository, but it uses more storage than necessary. To remove this
unreferenced data, you can run a cleanup operation on the repository. The
cleanup triggers a complete accounting of the repository's contents and deletes
any unreferenced data.

To run the repository cleanup operation in {kib}, go to the **Repositories**
list page and click the name of a repository. Then click **Clean up
repository**.

You can also use the <<clean-up-snapshot-repo-api,clean up snapshot repository
API>>.

[source,console]
----
POST _snapshot/my_repository/_cleanup
----
// TEST[setup:setup-snapshots]

The API returns:

[source,console-result]
----
{
  "results": {
    "deleted_bytes": 20,
    "deleted_blobs": 5
  }
}
----
// TESTRESPONSE[s/"deleted_bytes": 20/"deleted_bytes": $body.results.deleted_bytes/]
// TESTRESPONSE[s/"deleted_blobs": 5/"deleted_blobs": $body.results.deleted_blobs/]

Depending on the repository implementation, the number of bytes freed and the
number of blobs removed are either exact or approximate. Any non-zero value for
the number of blobs removed implies that unreferenced blobs were found and
subsequently cleaned up.

Most of the cleanup operations performed by this endpoint also run
automatically whenever you delete a snapshot from a repository. If you
regularly delete snapshots, a cleanup will in most cases free little or no
space, so you can run it less often.

[discrete]
[[snapshots-repository-backup]]
=== Back up a repository

You may wish to make an independent backup of your repository, for instance so
that you have an archive copy of its contents that you can use to recreate the
repository in its current state at a later date.

You must ensure that {es} does not write to the repository while you are taking
the backup of its contents. You can do this by unregistering it, or registering
it with `readonly: true`, on all your clusters. If {es} writes any data to the
repository during the backup, the contents of the backup may not be consistent
and it may not be possible to recover any data from it in the future.
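
One way to stop writes before the backup is to unregister the repository on
each cluster. The sketch below assumes the repository is named `my_repository`.
Unregistering removes only the cluster's reference to the repository. The
snapshot data in the storage location is left in place.

[source,console]
----
DELETE _snapshot/my_repository
----
// TEST[skip:illustrative example]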

Alternatively, if your repository supports it, you may take an atomic snapshot
of the underlying filesystem and then take a backup of this filesystem
snapshot. It is very important that the filesystem snapshot is taken
atomically.

WARNING: You cannot use filesystem snapshots of individual nodes as a backup
mechanism. You must use the {es} snapshot and restore feature to copy the
cluster contents to a separate repository. Then, if desired, you can take a
filesystem snapshot of this repository.

When restoring a repository from a backup, you must not register the repository
with {es} until the repository contents are fully restored. If you alter the
contents of a repository while it is registered with {es} then the repository
may become unreadable or may silently lose some of its contents.

include::repository-azure.asciidoc[]
include::repository-gcs.asciidoc[]
include::repository-s3.asciidoc[]
include::repository-shared-file-system.asciidoc[]
include::repository-read-only-url.asciidoc[]
include::repository-source-only.asciidoc[]