[[searchable-snapshots]]
== {search-snaps-cap}

beta::[]

{search-snaps-cap} let you reduce your operating costs by using
<<snapshot-restore, snapshots>> for resiliency rather than maintaining
<<scalability,replica shards>> within a cluster. When you mount an index from a
snapshot as a {search-snap}, {es} copies the index shards to local storage
within the cluster. This ensures that search performance is comparable to
searching any other index, and minimizes the need to access the snapshot
repository. Should a node fail, shards of a {search-snap} index are
automatically recovered from the snapshot repository.

This can result in significant cost savings for less frequently searched data.
With {search-snaps}, you no longer need an extra index shard copy to avoid data
loss, potentially halving the node local storage capacity necessary for
searching that data. Because {search-snaps} rely on the same snapshot mechanism
you use for backups, they have a minimal impact on your snapshot repository
storage costs.

[discrete]
[[using-searchable-snapshots]]
=== Using {search-snaps}

Searching a {search-snap} index is the same as searching any other index.
Search performance is comparable to regular indices because the shard data is
copied onto nodes in the cluster when the {search-snap} is mounted.

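For example, you can run an ordinary search against a mounted index. The index
name `my-mounted-index` below is only illustrative:

[source,console]
----
GET /my-mounted-index/_search
{
  "query": {
    "match_all": {}
  }
}
----
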
By default, {search-snap} indices have no replicas. The underlying snapshot
provides resilience and the query volume is expected to be low enough that a
single shard copy will be sufficient. However, if you need to support a higher
query volume, you can add replicas by adjusting the `index.number_of_replicas`
index setting.

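For example, the following sketch adds one replica to a mounted index. The
index name `my-mounted-index` is only illustrative:

[source,console]
----
PUT /my-mounted-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
----
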
If a node fails and {search-snap} shards need to be restored from the snapshot,
there is a brief window of time, while {es} allocates the shards to other nodes,
during which the cluster health will not be `green`. Searches that hit these
shards will fail or return partial results until the shards are reallocated.

You typically manage {search-snaps} through {ilm-init}. The
<<ilm-searchable-snapshot, searchable snapshots>> action automatically converts
an index to a {search-snap} when it reaches the `cold` phase. You can also make
indices in existing snapshots searchable by manually mounting them as
{search-snaps} with the <<searchable-snapshots-api-mount-snapshot, mount
snapshot>> API.

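As a sketch, an {ilm-init} policy that converts indices to {search-snaps} in the
`cold` phase might look like the following. The policy and repository names are
only illustrative, and you would normally combine this with your other phase
actions:

[source,console]
----
PUT _ilm/policy/my-archive-policy
{
  "policy": {
    "phases": {
      "cold": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "my_repository"
          }
        }
      }
    }
  }
}
----
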
To mount an index from a snapshot that contains multiple indices, we recommend
creating a <<clone-snapshot-api, clone>> of the snapshot that contains only the
index you want to search, and mounting the clone. You cannot delete a snapshot
if it has any mounted indices, so creating a clone enables you to manage the
lifecycle of the backup snapshot independently of any {search-snaps}.

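For example, the following sketch clones a single index out of an existing
snapshot and then mounts the clone. The repository, snapshot, and index names
are only illustrative:

[source,console]
----
PUT /_snapshot/my_repository/my_snapshot/_clone/my_snapshot_clone
{
  "indices": "my-index"
}

POST /_snapshot/my_repository/my_snapshot_clone/_mount?wait_for_completion=true
{
  "index": "my-index"
}
----
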
You can control the allocation of the shards of {search-snap} indices using the
same mechanisms as for regular indices. For example, you could use
<<shard-allocation-filtering>> to restrict {search-snap} shards to a subset of
your nodes.

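As a sketch, assuming the target nodes are tagged with a custom node attribute
such as `data_type: cold`, you could pin a mounted index to those nodes like
this:

[source,console]
----
PUT /my-mounted-index/_settings
{
  "index.routing.allocation.require.data_type": "cold"
}
----
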
We recommend that you <<indices-forcemerge, force-merge>> indices to a single
segment per shard before taking a snapshot that will be mounted as a
{search-snap} index. Each read from a snapshot repository takes time and costs
money, and the fewer segments there are, the fewer reads are needed to restore
the snapshot.

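For example, a minimal force-merge down to one segment per shard, using an
illustrative index name:

[source,console]
----
POST /my-index/_forcemerge?max_num_segments=1
----
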
[TIP]
====
{search-snaps-cap} are ideal for managing a large archive of historical data.
Historical information is typically searched less frequently than recent data
and therefore may not need the performance benefits of replicas.

For more complex or time-consuming searches, you can use <<async-search>> with
{search-snaps}, as in the example below.
====

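For example, a sketch of an asynchronous search against a mounted index. The
index name, field, and timeout are only illustrative:

[source,console]
----
POST /my-mounted-index/_async_search?wait_for_completion_timeout=2s
{
  "query": {
    "match": {
      "message": "error"
    }
  }
}
----
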
[discrete]
[[how-searchable-snapshots-work]]
=== How {search-snaps} work

When an index is mounted from a snapshot, {es} allocates its shards to data
nodes within the cluster. The data nodes then automatically restore the shard
data from the repository onto local storage. Once the restore process
completes, these shards respond to searches using the data held in local
storage and do not need to access the repository. This avoids incurring the
cost or performance penalty associated with reading data from the repository.

If a node holding one of these shards fails, {es} automatically allocates it to
another node, and that node restores the shard data from the repository. No
replicas are needed, and no complicated monitoring or orchestration is
necessary to restore lost shards.

{es} restores {search-snap} shards in the background and you can search them
even if they have not been fully restored. If a search hits a {search-snap}
shard before it has been fully restored, {es} eagerly retrieves the data needed
for the search. If a shard is freshly allocated to a node and still warming up,
some searches will be slower. However, searches typically access only a very
small fraction of the total shard data, so the performance penalty is usually
small.

Replicas of {search-snap} shards are restored by copying data from the
snapshot repository. In contrast, replicas of regular indices are restored by
copying data from the primary.