@@ -1,22 +1,47 @@
[[modules-snapshots]]
== Snapshot And Restore

-You can store snapshots of individual indices or an entire cluster in
-a remote repository like a shared file system, S3, or HDFS. These snapshots
-are great for backups because they can be restored relatively quickly. However,
-snapshots can only be restored to versions of Elasticsearch that can read the
-indices:
+A snapshot is a backup taken from a running Elasticsearch cluster. You can take
+a snapshot of individual indices or of the entire cluster and store it in a
+repository on a shared filesystem, and there are plugins that support remote
+repositories on S3, HDFS, Azure, Google Cloud Storage and more.
+
+Snapshots are taken incrementally. This means that when creating a snapshot of
+an index Elasticsearch will avoid copying any data that is already stored in
+the repository as part of an earlier snapshot of the same index. Therefore it
+can be efficient to take snapshots of your cluster quite frequently.
+
+Snapshots can be restored into a running cluster via the restore API. When
+restoring an index it is possible to alter the name of the restored index as
+well as some of its settings, allowing a great deal of flexibility in how the
+snapshot and restore functionality can be used.
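
The renaming and settings overrides described above might be expressed as a restore request along the following lines. This is a minimal sketch, not part of the change under review: the helper `build_restore_request` is hypothetical, the repository, snapshot and index names are placeholders, and the request is only constructed, never sent. It assumes the restore endpoint `POST /_snapshot/<repository>/<snapshot>/_restore` and its `indices`, `rename_pattern`, `rename_replacement` and `index_settings` body fields.

```python
import json


def build_restore_request(repository, snapshot, indices,
                          rename_pattern=None, rename_replacement=None,
                          index_settings=None):
    """Build (method, path, body) for a restore call. Nothing is sent.

    Hypothetical helper for illustration only; all names passed in are
    placeholders chosen by the caller.
    """
    body = {"indices": ",".join(indices)}
    # Renaming needs both the pattern and its replacement.
    if rename_pattern is not None and rename_replacement is not None:
        body["rename_pattern"] = rename_pattern
        body["rename_replacement"] = rename_replacement
    # Optional per-index settings to override on the restored index.
    if index_settings:
        body["index_settings"] = dict(index_settings)
    path = f"/_snapshot/{repository}/{snapshot}/_restore"
    return "POST", path, body


method, path, body = build_restore_request(
    "my_backup", "snapshot_1", ["index_1"],
    rename_pattern="index_(.+)",
    rename_replacement="restored_index_$1",
    index_settings={"index.number_of_replicas": 0},
)
print(method, path)
print(json.dumps(body, indent=2))
```

Keeping the request construction separate from the HTTP call makes it easy to inspect the body before anything touches a cluster.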
+
+WARNING: It is not possible to back up an Elasticsearch cluster simply by
+taking a copy of the data directories of all of its nodes. Elasticsearch may be
+making changes to the contents of its data directories while it is running, and
+this means that copying its data directories cannot be expected to capture a
+consistent picture of their contents. Attempts to restore a cluster from such a
+backup may fail, reporting corruption and/or missing files, or may appear to
+have succeeded having silently lost some of its data. The only reliable way to
+back up a cluster is by using the snapshot and restore functionality.
+
+[float]
+=== Version compatibility
+
+A snapshot contains a copy of the on-disk data structures that make up an
+index. This means that snapshots can only be restored to versions of
+Elasticsearch that can read the indices:

* A snapshot of an index created in 5.x can be restored to 6.x.
* A snapshot of an index created in 2.x can be restored to 5.x.
* A snapshot of an index created in 1.x can be restored to 2.x.

-Conversely, snapshots of indices created in 1.x **cannot** be restored to
-5.x or 6.x, and snapshots of indices created in 2.x **cannot** be restored
-to 6.x.
+Conversely, snapshots of indices created in 1.x **cannot** be restored to 5.x
+or 6.x, and snapshots of indices created in 2.x **cannot** be restored to 6.x.
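
The compatibility rules listed above can be summarised as a small check. This is an illustrative sketch only: `MAJOR_VERSIONS` and `can_restore_index` are hypothetical names, and the function additionally assumes that an index can be restored into the same major version it was created in, which the list above does not state explicitly.

```python
# Major versions in release order; the series skipped from 2.x to 5.x.
MAJOR_VERSIONS = [1, 2, 5, 6]


def can_restore_index(created_major, target_major):
    """Return True if an index created in `created_major` can be restored
    into a cluster running `target_major`.

    Hypothetical helper: allows the same major version (an assumption)
    or the next major version in the series (from the rules above).
    """
    if created_major not in MAJOR_VERSIONS or target_major not in MAJOR_VERSIONS:
        raise ValueError("unknown major version")
    gap = MAJOR_VERSIONS.index(target_major) - MAJOR_VERSIONS.index(created_major)
    return gap in (0, 1)


for created, target in [(5, 6), (2, 5), (1, 2), (1, 5), (1, 6), (2, 6)]:
    print(f"{created}.x -> {target}.x:", can_restore_index(created, target))
```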

-Snapshots are incremental and can contain indices created in various
-versions of Elasticsearch. If any indices in a snapshot were created in an
+Each snapshot can contain indices created in various versions of Elasticsearch,
+and when restoring a snapshot it must be possible to restore all of the indices
+into the target cluster. If any indices in a snapshot were created in an
incompatible version, you will not be able to restore the snapshot.

IMPORTANT: When backing up your data prior to an upgrade, keep in mind that you
@@ -28,8 +53,8 @@ that is incompatible with the version of the cluster you are currently running,
you can restore it on the latest compatible version and use
<<reindex-from-remote,reindex-from-remote>> to rebuild the index on the current
version. Reindexing from remote is only possible if the original index has
-source enabled. Retrieving and reindexing the data can take significantly longer
-than simply restoring a snapshot. If you have a large amount of data, we
+source enabled. Retrieving and reindexing the data can take significantly
+longer than simply restoring a snapshot. If you have a large amount of data, we
recommend testing the reindex from remote process with a subset of your data to
understand the time requirements before proceeding.