[role="xpack"]
[testenv="basic"]
[[getting-started-snapshot-lifecycle-management]]
=== Tutorial: Automate backups with {slm-init}

This tutorial demonstrates how to automate daily backups of {es} data streams and indices using an {slm-init} policy.
The policy takes <<modules-snapshots, snapshots>> of all data streams and indices in the cluster
and stores them in a local repository.
It also defines a retention policy and automatically deletes snapshots
when they are no longer needed.

To manage snapshots with {slm-init}, you:

. <<slm-gs-register-repository, Register a repository>>.
. <<slm-gs-create-policy, Create an {slm-init} policy>>.

To test the policy, you can manually trigger it to take an initial snapshot.

[discrete]
[[slm-gs-register-repository]]
==== Register a repository

To use {slm-init}, you must have a snapshot repository configured.
The repository can be local (shared filesystem) or remote (cloud storage).
Remote repositories can reside on S3, HDFS, Azure, Google Cloud Storage,
or any other platform supported by a {plugins}/repository.html[repository plugin].
Remote repositories are generally used for production deployments.

For this tutorial, you can register a local repository from
{kibana-ref}/snapshot-repositories.html[{kib} Management]
or use the create or update repository API:

[source,console]
-----------------------------------
PUT /_snapshot/my_repository
{
  "type": "fs",
  "settings": {
    "location": "my_backup_location"
  }
}
-----------------------------------
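
A shared filesystem repository can only be registered if its `location` falls within a directory listed in the `path.repo` setting of every master and data node.
As a quick check after registration, and assuming the repository is named `my_repository` as in the request above,
the verify snapshot repository API reports the nodes on which the repository is accessible:

[source,console]
-----------------------------------
POST /_snapshot/my_repository/_verify
-----------------------------------

If a node cannot reach the repository, the request returns an error instead of a list of nodes,
so this is an inexpensive test to run before you create a policy.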

[discrete]
[[slm-gs-create-policy]]
==== Set up a snapshot policy

Once you have a repository in place,
you can define an {slm-init} policy to take snapshots automatically.
The policy defines when to take snapshots, which data streams or indices should be included,
and what to name the snapshots.
A policy can also specify a <<slm-retention,retention policy>> and
automatically delete snapshots when they are no longer needed.

TIP: Don't be afraid to configure a policy that takes frequent snapshots.
Snapshots are incremental and make efficient use of storage.

You can define and manage policies through {kib} Management or with the create
or update policy API.

For example, you could define a `nightly-snapshots` policy
to back up all of your data streams and indices daily at 1:30AM UTC.

A create or update policy request defines the policy configuration in JSON:

[source,console]
--------------------------------------------------
PUT /_slm/policy/nightly-snapshots
{
  "schedule": "0 30 1 * * ?", <1>
  "name": "<nightly-snap-{now/d}>", <2>
  "repository": "my_repository", <3>
  "config": { <4>
    "indices": ["*"] <5>
  },
  "retention": { <6>
    "expire_after": "30d", <7>
    "min_count": 5, <8>
    "max_count": 50 <9>
  }
}
--------------------------------------------------
// TEST[continued]
<1> When the snapshot should be taken in
<<schedule-cron,Cron syntax>>: daily at 1:30AM UTC
<2> How to name the snapshot: use
<<date-math-index-names,date math>> to include the current date in the snapshot name
<3> Where to store the snapshot
<4> The configuration to be used for the snapshot requests (see below)
<5> Which data streams or indices to include in the snapshot: all data streams and indices
<6> Optional retention policy
<7> Delete snapshots that are more than 30 days old
<8> Always keep at least 5 snapshots, even if they are more than 30 days old
<9> Keep no more than 50 snapshots, even if they are less than 30 days old
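
Retention rules are not evaluated at the moment a snapshot is taken.
Expired snapshots are deleted by a separate cluster-level retention task that runs on the schedule in the `slm.retention_schedule` cluster setting.
As a sketch, assuming you want retention to run at the same time as the tutorial's policy,
you could set that schedule explicitly and then trigger a retention run by hand to see its effect:

[source,console]
--------------------------------------------------
PUT /_cluster/settings
{
  "persistent": {
    "slm.retention_schedule": "0 30 1 * * ?"
  }
}

POST /_slm/_execute_retention
--------------------------------------------------

The cron expression shown here is only an example value; pick a time when the cluster is lightly loaded,
because deleting snapshots competes with other snapshot operations on the repository.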

You can specify additional snapshot configuration options to customize how snapshots are taken.
For example, you could configure the policy to fail the snapshot
if one of the specified data streams or indices is missing.
For more information about snapshot options, see <<snapshots-take-snapshot,snapshot requests>>.
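
As a minimal sketch of such a customization, the following request defines a second policy
that backs up a single index and fails if that index does not exist.
The policy name `important-snapshots` and the index name `important` are hypothetical and used only for illustration;
`ignore_unavailable` and `include_global_state` are standard snapshot request options.

[source,console]
--------------------------------------------------
PUT /_slm/policy/important-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<important-snap-{now/d}>",
  "repository": "my_repository",
  "config": {
    "indices": ["important"], <1>
    "ignore_unavailable": false, <2>
    "include_global_state": false <3>
  }
}
--------------------------------------------------
<1> Hypothetical index name
<2> Fail the snapshot if a listed data stream or index is missing (`false` is also the default)
<3> Leave the cluster state out of the snapshot

Listing indices by name rather than with `*` is what makes `ignore_unavailable` matter:
a wildcard simply matches whatever exists, so nothing can be reported as missing.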

[discrete]
[[slm-gs-test-policy]]
==== Test the snapshot policy

A snapshot taken by {slm-init} is just like any other snapshot.
You can view information about snapshots in {kib} Management or
get info with the <<snapshots-monitor-snapshot-restore, snapshot APIs>>.
In addition, {slm-init} keeps track of policy successes and failures so you
have insight into how the policy is working. If the policy has executed at
least once, the <<slm-api-get-policy, get policy>> API returns additional metadata
that shows whether the snapshot was successfully initiated.
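
Because {slm-init} snapshots are regular snapshots, one way to list the ones created by the policy in this tutorial
is to ask the get snapshot API for every snapshot in `my_repository` whose name starts with `nightly-snap-`.
The wildcard pattern below is an assumption derived from the tutorial's `name` setting; adjust it if your policy uses a different name.

[source,console]
--------------------------------------------------
GET /_snapshot/my_repository/nightly-snap-*
--------------------------------------------------

Each snapshot in the response includes a `state` field, such as `SUCCESS`, `PARTIAL`, or `FAILED`,
which tells you whether that snapshot actually completed.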

You can manually execute a snapshot policy to take a snapshot immediately.
This is useful for taking snapshots before making a configuration change or
upgrading, or for testing a new policy.
Manually executing a policy does not affect its configured schedule.

Instead of waiting for the scheduled 1:30 a.m. run, tell {slm-init} to take a snapshot
right now using the policy's configuration:

[source,console]
--------------------------------------------------
POST /_slm/policy/nightly-snapshots/_execute
--------------------------------------------------
// TEST[skip:we can't easily handle snapshots from docs tests]

After forcing the `nightly-snapshots` policy to run,
you can retrieve the policy to get success or failure information.

[source,console]
--------------------------------------------------
GET /_slm/policy/nightly-snapshots?human
--------------------------------------------------
// TEST[continued]

Only the most recent success and failure are returned,
but all policy executions are recorded in the `.slm-history*` indices.
The response also shows when the policy is scheduled to execute next.

NOTE: The response shows if the policy succeeded in _initiating_ a snapshot.
However, that does not guarantee that the snapshot completed successfully.
It is possible for the initiated snapshot to fail if, for example, the connection to a remote
repository is lost while copying files.

[source,console-result]
--------------------------------------------------
{
  "nightly-snapshots" : {
    "version": 1,
    "modified_date": "2019-04-23T01:30:00.000Z",
    "modified_date_millis": 1556048137314,
    "policy" : {
      "schedule": "0 30 1 * * ?",
      "name": "<nightly-snap-{now/d}>",
      "repository": "my_repository",
      "config": {
        "indices": ["*"]
      },
      "retention": {
        "expire_after": "30d",
        "min_count": 5,
        "max_count": 50
      }
    },
    "last_success": { <1>
      "snapshot_name": "nightly-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a", <2>
      "time_string": "2019-04-24T16:43:49.316Z",
      "time": 1556124229316
    },
    "last_failure": { <3>
      "snapshot_name": "nightly-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
      "time_string": "2019-04-02T01:30:00.000Z",
      "time": 1556042030000,
      "details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
    },
    "next_execution": "2019-04-24T01:30:00.000Z", <4>
    "next_execution_millis": 1556048160000
  }
}
--------------------------------------------------
// TESTRESPONSE[skip:the presence of last_failure and last_success is asynchronous and will be present for users, but is untestable]
<1> Information about the last time the policy successfully initiated a snapshot
<2> The name of the snapshot that was successfully initiated
<3> Information about the last time the policy failed to initiate a snapshot
<4> The next time the policy will execute
  157. <4> The next time the policy will execute