[role="xpack"]
[testenv="basic"]
[[getting-started-snapshot-lifecycle-management]]
== Getting started with snapshot lifecycle management

Let's get started with snapshot lifecycle management (SLM) by working through a
hands-on scenario. The goal of this example is to automatically back up {es}
indices using <<modules-snapshots,snapshots>> every day at a particular time.
Once these snapshots have been created, they are kept for a configured amount of
time and then deleted according to a configured retention policy.

[float]
[[slm-and-security]]
=== Security and SLM

Before starting, it's important to understand the privileges that are needed
when configuring SLM if you are using {es} security features. There are two
built-in cluster privileges that can help: `manage_slm` and `read_slm`. Note
that the `cluster:admin/snapshot/*` privilege allows taking and deleting
snapshots even for indices the role may not have access to.

An example of configuring an administrator role for SLM follows:

[source,console]
-----------------------------------
POST /_security/role/slm-admin
{
  "cluster": ["manage_slm", "cluster:admin/snapshot/*"],
  "indices": [
    {
      "names": [".slm-history-*"],
      "privileges": ["all"]
    }
  ]
}
-----------------------------------
// TEST[skip:security is not enabled here]

Or, for a read-only role that can retrieve policies (but not update, execute, or
delete them) and can only view the history index:

[source,console]
-----------------------------------
POST /_security/role/slm-read-only
{
  "cluster": ["read_slm"],
  "indices": [
    {
      "names": [".slm-history-*"],
      "privileges": ["read"]
    }
  ]
}
-----------------------------------
// TEST[skip:security is not enabled here]

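To put these roles to use, assign one of them to a user. Below is a minimal
sketch that creates a hypothetical `slm_admin` user with the `slm-admin` role
defined above; the username and password are placeholders:

[source,console]
-----------------------------------
POST /_security/user/slm_admin
{
  "password": "slm-admin-password",
  "roles": ["slm-admin"],
  "full_name": "Snapshot management user"
}
-----------------------------------
// TEST[skip:security is not enabled here]
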
[float]
[[slm-gs-create-policy]]
=== Setting up a repository

Before we can set up an SLM policy, we'll need to set up a
<<snapshots-repositories,snapshot repository>> where the snapshots will be
stored. Repositories can use {plugins}/repository.html[many different backends],
including cloud storage providers. You'll probably want to use one of these in
production, but for this example we'll use a shared file system repository:

[source,console]
-----------------------------------
PUT /_snapshot/my_repository
{
  "type": "fs",
  "settings": {
    "location": "my_backup_location"
  }
}
-----------------------------------

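If you want to confirm that the repository is reachable on all nodes before
relying on it, you can optionally check it with the standard verify repository
API:

[source,console]
-----------------------------------
POST /_snapshot/my_repository/_verify
-----------------------------------
// TEST[continued]
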
[float]
=== Setting up a policy

Now that we have a repository in place, we can create a policy to automatically
take snapshots. Policies are written in JSON and define when to take snapshots,
what the snapshots should be named, and which indices should be included, among
other things. We'll use the <<slm-api-put,Put Policy>> API to create the policy.

When configuring a policy, retention can also optionally be configured. See the
<<slm-retention,SLM retention>> documentation for a full description of how
retention works.

[source,console]
--------------------------------------------------
PUT /_slm/policy/nightly-snapshots
{
  "schedule": "0 30 1 * * ?", <1>
  "name": "<nightly-snap-{now/d}>", <2>
  "repository": "my_repository", <3>
  "config": { <4>
    "indices": ["*"] <5>
  },
  "retention": { <6>
    "expire_after": "30d", <7>
    "min_count": 5, <8>
    "max_count": 50 <9>
  }
}
--------------------------------------------------
// TEST[continued]
<1> when the snapshot should be taken, using <<schedule-cron,Cron syntax>>, in
this case at 1:30AM each day
<2> the name each snapshot should be given, using
<<date-math-index-names,date math>> to include the current date in the name
of the snapshot
<3> the repository the snapshot should be stored in
<4> the configuration to be used for the snapshot requests (see below)
<5> which indices should be included in the snapshot, in this case, every index
<6> optional retention configuration
<7> keep snapshots for 30 days
<8> always keep at least 5 successful snapshots
<9> keep no more than 50 successful snapshots, even if they're less than 30 days old

This policy will take a snapshot of every index each day at 1:30AM UTC.
Snapshots are incremental, allowing frequent snapshots to be stored efficiently,
so don't be afraid to configure a policy to take frequent snapshots.

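For example, a separate policy that takes a snapshot at the top of every hour
might look like the sketch below. The `hourly-snapshots` name is only an
illustration and is not part of this scenario; because SLM appends a unique
suffix to each snapshot name, snapshots taken on the same day won't collide.

[source,console]
--------------------------------------------------
PUT /_slm/policy/hourly-snapshots
{
  "schedule": "0 0 * * * ?",
  "name": "<hourly-snap-{now/d}>",
  "repository": "my_repository",
  "config": {
    "indices": ["*"]
  }
}
--------------------------------------------------
// TEST[skip:illustrative example only]
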
In addition to specifying the indices that should be included in the snapshot,
the `config` field can be used to customize other aspects of the snapshot. You
can use any option allowed in <<snapshots-take-snapshot,a regular snapshot
request>>, so you can specify, for example, whether the snapshot should fail in
special cases, such as if one of the specified indices cannot be found.

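As a sketch, the policy below (with a hypothetical name) uses the standard
snapshot request options `ignore_unavailable` and `include_global_state` so that
missing indices don't fail the snapshot and the cluster state is left out:

[source,console]
--------------------------------------------------
PUT /_slm/policy/nightly-snapshots-tolerant
{
  "schedule": "0 30 1 * * ?",
  "name": "<nightly-snap-{now/d}>",
  "repository": "my_repository",
  "config": {
    "indices": ["*"],
    "ignore_unavailable": true,
    "include_global_state": false
  }
}
--------------------------------------------------
// TEST[skip:illustrative example only]
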
[float]
=== Making sure the policy works

While snapshots taken by SLM policies can be viewed through the standard snapshot
API, SLM also keeps track of policy successes and failures in a way that makes it
easier to check that the policy is working. Once a policy has executed at least
once, viewing the policy using the <<slm-api-get,Get Policy API>> returns
metadata indicating whether the snapshot was successfully initiated or not.

Instead of waiting for our policy to run, let's tell SLM to take a snapshot
right now using the configuration from our policy, rather than waiting for
1:30AM.

[source,console]
--------------------------------------------------
POST /_slm/policy/nightly-snapshots/_execute
--------------------------------------------------
// TEST[skip:we can't easily handle snapshots from docs tests]

This request will kick off a snapshot for our policy right now, regardless of
the schedule in the policy. This is useful for taking snapshots before making
a configuration change, upgrading, or, for our purposes, making sure the policy
works as expected. The policy will continue to run on its configured schedule
after this manually triggered execution.

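You can also look at the snapshot itself with the regular snapshot APIs. For
example, the cat snapshots API lists the snapshots in the repository, including
the one the policy just created (its name will end in a generated suffix):

[source,console]
--------------------------------------------------
GET /_cat/snapshots/my_repository?v
--------------------------------------------------
// TEST[skip:we can't easily handle snapshots from docs tests]

To see what SLM itself recorded about this execution, retrieve the policy:
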
[source,console]
--------------------------------------------------
GET /_slm/policy/nightly-snapshots?human
--------------------------------------------------
// TEST[continued]

This request returns a response that includes the policy definition, information
about the last time the policy succeeded and failed, and the next time the
policy will be executed.

[source,console-result]
--------------------------------------------------
{
  "nightly-snapshots" : {
    "version": 1,
    "modified_date": "2019-04-23T01:30:00.000Z",
    "modified_date_millis": 1556048137314,
    "policy" : {
      "schedule": "0 30 1 * * ?",
      "name": "<nightly-snap-{now/d}>",
      "repository": "my_repository",
      "config": {
        "indices": ["*"]
      },
      "retention": {
        "expire_after": "30d",
        "min_count": 5,
        "max_count": 50
      }
    },
    "last_success": { <1>
      "snapshot_name": "nightly-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a", <2>
      "time_string": "2019-04-24T16:43:49.316Z",
      "time": 1556124229316
    },
    "last_failure": { <3>
      "snapshot_name": "nightly-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
      "time_string": "2019-04-02T01:30:00.000Z",
      "time": 1556042030000,
      "details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
    },
    "next_execution": "2019-04-24T01:30:00.000Z", <4>
    "next_execution_millis": 1556048160000
  }
}
--------------------------------------------------
// TESTRESPONSE[skip:the presence of last_failure and last_success is asynchronous and will be present for users, but is untestable]
<1> information about the last time the policy successfully initiated a snapshot
<2> the name of the snapshot that was successfully initiated
<3> information about the last time the policy failed to initiate a snapshot
<4> the next time the policy will execute

NOTE: This metadata only indicates whether the request to initiate the snapshot
was made successfully. After the snapshot has been successfully started, it is
still possible for the snapshot to fail if, for example, the connection to a
remote repository is lost while copying files.

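To check whether an initiated snapshot actually completed, you can retrieve it
with the standard get snapshot API and look at its `state` (for example
`SUCCESS`, `PARTIAL`, or `FAILED`). Substitute the snapshot name reported in
`last_success`; the name below is taken from the example response above:

[source,console]
--------------------------------------------------
GET /_snapshot/my_repository/nightly-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a
--------------------------------------------------
// TEST[skip:we can't easily handle snapshots from docs tests]
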
If you're following along, the returned SLM policy shouldn't have a
`last_failure` field; it's included above only as an example. You should,
however, see a `last_success` field and a snapshot name. If you do, you've
successfully taken your first snapshot using SLM!

While only the most recent success and failure are available through the Get
Policy API, all policy executions are recorded to a history index, which may be
queried by searching the index pattern `.slm-history*`.

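For example, here is a sketch of a search for the most recent history entries
for this policy. The `policy` and `@timestamp` field names are based on the
documents SLM writes to its history index; adjust them if they differ in your
version:

[source,console]
--------------------------------------------------
GET /.slm-history*/_search
{
  "query": {
    "term": { "policy": "nightly-snapshots" }
  },
  "sort": [
    { "@timestamp": { "order": "desc" } }
  ]
}
--------------------------------------------------
// TEST[skip:no policy history is available in docs tests]
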
That's it! We have our first SLM policy set up to periodically take snapshots
so that our backups are always up to date. You can read more details in the
<<snapshot-lifecycle-management-api,SLM API documentation>> and the
<<modules-snapshots,general snapshot documentation>>.