[[snapshots-take-snapshot]]
== Create a snapshot

A repository can contain multiple snapshots of the same cluster. Snapshots are identified by unique names within the
cluster.

Use the <<put-snapshot-repo-api,create or update snapshot repository API>> to
register or update a snapshot repository, and then use the
<<create-snapshot-api,create snapshot API>> to create a snapshot in a
repository.

The following request creates a snapshot with the name `snapshot_1` in the repository `my_backup`:

////
[source,console]
-----------------------------------
PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "my_backup_location"
  }
}
-----------------------------------
// TESTSETUP
////

[source,console]
-----------------------------------
PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true
-----------------------------------

The `wait_for_completion` parameter specifies whether the request returns immediately after snapshot
initialization (default) or waits for the snapshot to complete. During snapshot initialization, information about all
previous snapshots is loaded into memory, which means that in large repositories it may take several seconds (or
even minutes) for this request to return even if the `wait_for_completion` parameter is set to `false`.

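As a minimal sketch of the asynchronous pattern, the following requests start a
snapshot without waiting for it to finish and then check its progress with the
snapshot status API. The snapshot name `snapshot_async` is a hypothetical
placeholder; the repository `my_backup` is the one used elsewhere on this page.

[source,console]
-----------------------------------
# Return as soon as the snapshot has been initialized
PUT /_snapshot/my_backup/snapshot_async?wait_for_completion=false

# Check how far the snapshot has progressed
GET /_snapshot/my_backup/snapshot_async/_status
-----------------------------------
// TEST[skip:illustrative sketch]
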
By default, a snapshot backs up all data streams and open indices in the cluster. You can change this behavior by
specifying the list of data streams and indices in the body of the snapshot request:

[source,console]
-----------------------------------
PUT /_snapshot/my_backup/snapshot_2?wait_for_completion=true
{
  "indices": "data_stream_1,index_1,index_2",
  "ignore_unavailable": true,
  "include_global_state": false,
  "metadata": {
    "taken_by": "kimchy",
    "taken_because": "backup before upgrading"
  }
}
-----------------------------------
// TEST[skip:cannot complete subsequent snapshot]

Use the `indices` parameter to list the data streams and indices that should be included in the snapshot. This parameter supports
<<api-multi-index,multi-target syntax>>, although the options that control the behavior of multi-target syntax
must be supplied in the body of the request, rather than as request parameters.

Data stream backups include the stream's backing indices and metadata, such as
the current <<data-streams-generation,generation>> and timestamp field.

You can also choose to include only specific backing indices in a snapshot.
However, these backups do not include the associated data stream's
metadata or its other backing indices.

Snapshots can also include a data stream but exclude specific backing indices.
When you restore the data stream, it will contain only backing indices present
in the snapshot. If the stream's original write index is not in the snapshot,
the most recent backing index from the snapshot becomes the stream's write index.

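For illustration, a request that snapshots a single backing index rather than
the whole stream might look like the following. The snapshot name and the
backing index name `.ds-data_stream_1-000001` are hypothetical placeholders;
actual backing index names depend on your {es} version and the stream's
generation.

[source,console]
-----------------------------------
# Backs up one backing index only, without the data stream's metadata
PUT /_snapshot/my_backup/snapshot_backing_index?wait_for_completion=true
{
  "indices": ".ds-data_stream_1-000001",
  "include_global_state": false
}
-----------------------------------
// TEST[skip:illustrative sketch]
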
[discrete]
[[create-snapshot-process-details]]
=== Snapshot process details

The snapshot process works by taking a byte-for-byte copy of the files that
make up each index or data stream and placing these copies in the repository.
These files are mostly written by Lucene and contain a compact representation
of all the data in each index or data stream in a form that is designed to be
searched efficiently. This means that when you restore an index or data stream
from a snapshot there is no need to rebuild these search-focused data
structures. It also means that you can use <<searchable-snapshots>> to directly
search the data in the repository.

The snapshot process is incremental: {es} compares the files that make up the
index or data stream against the files that already exist in the repository
and only copies files that were created or changed
since the last snapshot. Snapshots are very space-efficient since they reuse
any files copied to the repository by earlier snapshots.

Snapshotting does not interfere with ongoing indexing or searching operations.
A snapshot captures a view of each shard at some point in time between the
start and end of the snapshotting process. The snapshot may not include
documents added to a data stream or index after the snapshot process starts.

You can start multiple snapshot operations at the same time. Concurrent snapshot
operations are limited by the `snapshot.max_concurrent_operations` cluster
setting, which defaults to `1000`. This limit applies in total to all ongoing snapshot
creation, cloning, and deletion operations. {es} will reject any operations
that would exceed this limit.

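If the default limit does not suit your cluster, one way to change it is the
cluster update settings API. The following is a sketch that assumes the setting
can be updated dynamically on your version; the value `500` is only an example.

[source,console]
-----------------------------------
# Lower the total number of concurrent snapshot operations
PUT /_cluster/settings
{
  "persistent": {
    "snapshot.max_concurrent_operations": 500
  }
}
-----------------------------------
// TEST[skip:illustrative sketch]
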
The snapshot process starts immediately for the primary shards that have been
started and are not relocating at the moment. {es} waits for relocation or
initialization of shards to complete before snapshotting them.

Besides creating a copy of each data stream and index, the snapshot process can
also store global cluster metadata, which includes persistent cluster settings,
templates, and data stored in system indices, such as Watches and task records,
regardless of whether those system indices are named in the `indices` section
of the request. You can also use the create snapshot
API's <<create-snapshot-api-feature-states,`feature_states`>> parameter to
include only a subset of system indices in the snapshot. Snapshots do not
store transient settings or registered snapshot repositories.

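As a sketch of the `feature_states` parameter, the following request skips the
global cluster state but still includes the system indices of one feature. The
snapshot name is hypothetical, and the feature name `watcher` is an assumption:
the available feature names depend on your cluster and installed plugins.

[source,console]
-----------------------------------
# Include only the system indices belonging to the named feature
PUT /_snapshot/my_backup/snapshot_features?wait_for_completion=true
{
  "indices": "index_1",
  "include_global_state": false,
  "feature_states": [ "watcher" ]
}
-----------------------------------
// TEST[skip:illustrative sketch]
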
While a snapshot of a particular shard is being created, the shard cannot be
moved to another node, which can interfere with rebalancing and allocation
filtering. {es} can only move the shard to another node (according to the current
allocation filtering settings and rebalancing algorithm) after the snapshot
process is finished.

You can use the <<get-snapshot-api,get snapshot API>> to retrieve information
about ongoing and completed snapshots. See
<<snapshots-monitor-snapshot-restore,Monitor snapshot and restore progress>>.

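For example, the following requests retrieve information about the `snapshot_1`
snapshot created earlier and about any snapshots currently running in the
`my_backup` repository:

[source,console]
-----------------------------------
# Information about a specific snapshot
GET /_snapshot/my_backup/snapshot_1

# Snapshots that are currently running in the repository
GET /_snapshot/my_backup/_current
-----------------------------------
// TEST[skip:illustrative sketch]
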
[discrete]
[[create-snapshot-options]]
=== Options for creating a snapshot

The create snapshot request supports the
`ignore_unavailable` option. Setting it to `true` causes data streams and indices that do not exist to be ignored during snapshot
creation. By default, when the `ignore_unavailable` option is not set and a data stream or index is missing, the snapshot request fails.

Set `include_global_state` to `false` to prevent the global cluster state from being stored as part of
the snapshot.

IMPORTANT: The global cluster state includes the cluster's index
templates, such as those <<create-index-template,matching a data
stream>>. If your snapshot includes data streams, we recommend storing the
global state as part of the snapshot. This lets you later restore any
templates required for a data stream.

By default, the entire snapshot will fail if one or more indices participating in the snapshot do not have
all primary shards available. You can change this behavior by setting `partial` to `true`. The `expand_wildcards`
option controls whether hidden and closed indices are included in the snapshot, and defaults to `all`.

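The following sketch combines these options in one request. It tolerates
missing indices and unavailable primary shards, and restricts wildcard
expansion to open indices. The snapshot name and the `index_*` pattern are
hypothetical placeholders.

[source,console]
-----------------------------------
PUT /_snapshot/my_backup/snapshot_partial?wait_for_completion=true
{
  "indices": "index_*",
  "ignore_unavailable": true,
  "partial": true,
  "expand_wildcards": "open"
}
-----------------------------------
// TEST[skip:illustrative sketch]
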
Use the `metadata` field to attach arbitrary metadata to the snapshot,
such as who took the snapshot,
why it was taken, or any other data that might be useful.

Snapshot names can be automatically derived using <<date-math-index-names,date math expressions>>, as when creating
new indices. Special characters must be URI encoded.

For example, use the <<create-snapshot-api,create snapshot API>> to create
a snapshot with the current day in the name, such as `snapshot-2020.07.11`:

[source,console]
-----------------------------------
# PUT /_snapshot/my_backup/<snapshot-{now/d}>
PUT /_snapshot/my_backup/%3Csnapshot-%7Bnow%2Fd%7D%3E
-----------------------------------
// TEST[continued]

NOTE: You can also create snapshots that are copies of part of an existing snapshot using the <<clone-snapshot-api,clone snapshot API>>.
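
As a brief sketch, the following request clones one index from `snapshot_1`
into a new snapshot in the same repository. The target name `snapshot_1_clone`
is hypothetical, and the request assumes that `snapshot_1` contains an index
named `index_1`.

[source,console]
-----------------------------------
PUT /_snapshot/my_backup/snapshot_1/_clone/snapshot_1_clone
{
  "indices": "index_1"
}
-----------------------------------
// TEST[skip:illustrative sketch]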