123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205 |
- [[snapshots-take-snapshot]]
- == Take a snapshot of one or more indices
- ++++
- <titleabbrev>Take a snapshot</titleabbrev>
- ++++
- A repository can contain multiple snapshots of the same cluster. Snapshots are identified by unique names within the
- cluster. A snapshot with the name `snapshot_1` in the repository `my_backup` can be created by executing the following
- command:
- ////
- [source,console]
- -----------------------------------
- PUT /_snapshot/my_backup
- {
- "type": "fs",
- "settings": {
- "location": "my_backup_location"
- }
- }
- -----------------------------------
- // TESTSETUP
- ////
- [source,console]
- -----------------------------------
- PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true
- -----------------------------------
- The `wait_for_completion` parameter specifies whether or not the request should return immediately after snapshot
- initialization (default) or wait for snapshot completion. During snapshot initialization, information about all
- previous snapshots is loaded into the memory, which means that in large repositories it may take several seconds (or
- even minutes) for this command to return even if the `wait_for_completion` parameter is set to `false`.
- By default a snapshot of all open and started indices in the cluster is created. This behavior can be changed by
- specifying the list of indices in the body of the snapshot request.
- [source,console]
- -----------------------------------
- PUT /_snapshot/my_backup/snapshot_2?wait_for_completion=true
- {
- "indices": "index_1,index_2",
- "ignore_unavailable": true,
- "include_global_state": false,
- "metadata": {
- "taken_by": "kimchy",
- "taken_because": "backup before upgrading"
- }
- }
- -----------------------------------
- // TEST[skip:cannot complete subsequent snapshot]
- The list of indices that should be included into the snapshot can be specified using the `indices` parameter that
- supports <<multi-index,multi-target syntax>>, although the options which control the behavior of multi index syntax
- must be supplied in the body of the request, rather than as request parameters. The snapshot request also supports the
- `ignore_unavailable` option. Setting it to `true` will cause indices that do not exist to be ignored during snapshot
- creation. By default, when `ignore_unavailable` option is not set and an index is missing the snapshot request will fail.
- By setting `include_global_state` to false it's possible to prevent the cluster global state to be stored as part of
- the snapshot. By default, the entire snapshot will fail if one or more indices participating in the snapshot don't have
- all primary shards available. This behaviour can be changed by setting `partial` to `true`. The `expand_wildcards`
- option can be used to control whether hidden and closed indices will be included in the snapshot, and defaults to `all`.
- The `metadata` field can be used to attach arbitrary metadata to the snapshot. This may be a record of who took the snapshot,
- why it was taken, or any other data that might be useful.
- Snapshot names can be automatically derived using <<date-math-index-names,date math expressions>>, similarly as when creating
- new indices. Note that special characters need to be URI encoded.
- For example, creating a snapshot with the current day in the name, like `snapshot-2018.05.11`, can be achieved with
- the following command:
- [source,console]
- -----------------------------------
- # PUT /_snapshot/my_backup/<snapshot-{now/d}>
- PUT /_snapshot/my_backup/%3Csnapshot-%7Bnow%2Fd%7D%3E
- -----------------------------------
- // TEST[continued]
- The index snapshot process is incremental. In the process of making the index snapshot Elasticsearch analyses
- the list of the index files that are already stored in the repository and copies only files that were created or
- changed since the last snapshot. That allows multiple snapshots to be preserved in the repository in a compact form.
- Snapshotting process is executed in non-blocking fashion. All indexing and searching operation can continue to be
- executed against the index that is being snapshotted. However, a snapshot represents the point-in-time view of the index
- at the moment when snapshot was created, so no records that were added to the index after the snapshot process was started
- will be present in the snapshot. The snapshot process starts immediately for the primary shards that has been started
- and are not relocating at the moment. Before version 1.2.0, the snapshot operation fails if the cluster has any relocating or
- initializing primaries of indices participating in the snapshot. Starting with version 1.2.0, Elasticsearch waits for
- relocation or initialization of shards to complete before snapshotting them.
- Besides creating a copy of each index the snapshot process can also store global cluster metadata, which includes persistent
- cluster settings and templates. The transient settings and registered snapshot repositories are not stored as part of
- the snapshot.
- Only one snapshot process can be executed in the cluster at any time. While snapshot of a particular shard is being
- created this shard cannot be moved to another node, which can interfere with rebalancing process and allocation
- filtering. Elasticsearch will only be able to move a shard to another node (according to the current allocation
- filtering settings and rebalancing algorithm) once the snapshot is finished.
- Once a snapshot is created information about this snapshot can be obtained using the following command:
- [source,console]
- -----------------------------------
- GET /_snapshot/my_backup/snapshot_1
- -----------------------------------
- // TEST[continued]
- This command returns basic information about the snapshot including start and end time, version of
- Elasticsearch that created the snapshot, the list of included indices, the current state of the
- snapshot and the list of failures that occurred during the snapshot. The snapshot `state` can be
- [horizontal]
- `IN_PROGRESS`::
- The snapshot is currently running.
- `SUCCESS`::
- The snapshot finished and all shards were stored successfully.
- `FAILED`::
- The snapshot finished with an error and failed to store any data.
- `PARTIAL`::
- The global cluster state was stored, but data of at least one shard wasn't stored successfully.
- The `failure` section in this case should contain more detailed information about shards
- that were not processed correctly.
- `INCOMPATIBLE`::
- The snapshot was created with an old version of Elasticsearch and therefore is incompatible with
- the current version of the cluster.
- Similar as for repositories, information about multiple snapshots can be queried in one go, supporting wildcards as well:
- [source,console]
- -----------------------------------
- GET /_snapshot/my_backup/snapshot_*,some_other_snapshot
- -----------------------------------
- // TEST[continued]
- All snapshots currently stored in the repository can be listed using the following command:
- [source,console]
- -----------------------------------
- GET /_snapshot/my_backup/_all
- -----------------------------------
- // TEST[continued]
- The command fails if some of the snapshots are unavailable. The boolean parameter `ignore_unavailable` can be used to
- return all snapshots that are currently available.
- Getting all snapshots in the repository can be costly on cloud-based repositories,
- both from a cost and performance perspective. If the only information required is
- the snapshot names/uuids in the repository and the indices in each snapshot, then
- the optional boolean parameter `verbose` can be set to `false` to execute a more
- performant and cost-effective retrieval of the snapshots in the repository. Note
- that setting `verbose` to `false` will omit all other information about the snapshot
- such as status information, the number of snapshotted shards, etc. The default
- value of the `verbose` parameter is `true`.
- It is also possible to retrieve snapshots from multiple repositories in one go, for example:
- [source,console]
- -----------------------------------
- GET /_snapshot/_all
- GET /_snapshot/my_backup,my_fs_backup
- GET /_snapshot/my*/snap*
- -----------------------------------
- // TEST[skip:no my_fs_backup]
- A currently running snapshot can be retrieved using the following command:
- [source,console]
- -----------------------------------
- GET /_snapshot/my_backup/_current
- -----------------------------------
- // TEST[continued]
- A snapshot can be deleted from the repository using the following command:
- [source,console]
- -----------------------------------
- DELETE /_snapshot/my_backup/snapshot_2
- -----------------------------------
- // TEST[continued]
- When a snapshot is deleted from a repository, Elasticsearch deletes all files that are associated with the deleted
- snapshot and not used by any other snapshots. If the deleted snapshot operation is executed while the snapshot is being
- created the snapshotting process will be aborted and all files created as part of the snapshotting process will be
- cleaned. Therefore, the delete snapshot operation can be used to cancel long running snapshot operations that were
- started by mistake.
- It is also possible to delete multiple snapshots from a repository in one go, for example:
- [source,console]
- -----------------------------------
- DELETE /_snapshot/my_backup/my_backup,my_fs_backup
- DELETE /_snapshot/my_backup/snap*
- -----------------------------------
- // TEST[skip:no my_fs_backup]
|