[[searchable-snapshots]]
== {search-snaps-cap}

beta::[]

{search-snaps-cap} let you reduce your operating costs by using
<<snapshot-restore, snapshots>> for resiliency rather than maintaining
<<scalability,replica shards>> within a cluster. When you mount an index from a
snapshot as a {search-snap}, {es} copies the index shards to local storage
within the cluster. This ensures that search performance is comparable to
searching any other index, and minimizes the need to access the snapshot
repository. Should a node fail, shards of a {search-snap} index are
automatically recovered from the snapshot repository.

This can result in significant cost savings for less frequently searched data.
With {search-snaps}, you no longer need an extra index shard copy to avoid data
loss, potentially halving the node local storage capacity necessary for
searching that data. Because {search-snaps} rely on the same snapshot mechanism
you use for backups, they have a minimal impact on your snapshot repository
storage costs.

[discrete]
[[using-searchable-snapshots]]
=== Using {search-snaps}

Searching a {search-snap} index is the same as searching any other index.
Search performance is comparable to regular indices because the shard data is
copied onto nodes in the cluster when the {search-snap} is mounted.

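For example, a mounted index responds to an ordinary search request; the index
name here is a placeholder:

[source,console]
----
GET /my-index/_search
{
  "query": {
    "match_all": {}
  }
}
----
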
By default, {search-snap} indices have no replicas. The underlying snapshot
provides resilience, and the query volume is expected to be low enough that a
single shard copy is sufficient. However, if you need to support a higher
query volume, you can add replicas by adjusting the `index.number_of_replicas`
index setting.

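For example, the following request adds one replica to a mounted index (the
index name is a placeholder):

[source,console]
----
PUT /my-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
----
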
If a node fails and {search-snap} shards need to be restored from the snapshot,
there is a brief window of time while {es} allocates the shards to other nodes
during which the cluster health will not be `green`. Searches that hit these
shards may fail or return partial results until the shards are reallocated.

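You can use the cluster health API to wait for the cluster to return to
`green` after such a failure, for example:

[source,console]
----
GET /_cluster/health?wait_for_status=green&timeout=30s
----
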
You typically manage {search-snaps} through {ilm-init}. The
<<ilm-searchable-snapshot, searchable snapshots>> action automatically converts
an index to a {search-snap} when it reaches the `cold` phase. You can also make
indices in existing snapshots searchable by manually mounting them as
{search-snaps} with the <<searchable-snapshots-api-mount-snapshot, mount
snapshot>> API.

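For example, here is a minimal sketch of such a policy; the policy name,
repository name, and `min_age` value are placeholders:

[source,console]
----
PUT /_ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "cold": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "my_repository"
          }
        }
      }
    }
  }
}
----
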
To mount an index from a snapshot that contains multiple indices, we recommend
creating a <<clone-snapshot-api, clone>> of the snapshot that contains only the
index you want to search, and mounting the clone. You cannot delete a snapshot
if it has any mounted indices, so creating a clone enables you to manage the
lifecycle of the backup snapshot independently of any {search-snaps}.

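For example, the following requests clone a single index out of a multi-index
snapshot and then mount the clone; the repository, snapshot, and index names
are placeholders:

[source,console]
----
# Clone only my-index out of my_snapshot into a new snapshot
PUT /_snapshot/my_repository/my_snapshot/_clone/my_snapshot_clone
{
  "indices": "my-index"
}

# Mount the index from the clone as a searchable snapshot index
POST /_snapshot/my_repository/my_snapshot_clone/_mount?wait_for_completion=true
{
  "index": "my-index"
}
----
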
You can control the allocation of the shards of {search-snap} indices using the
same mechanisms as for regular indices. For example, you could use
<<shard-allocation-filtering>> to restrict {search-snap} shards to a subset of
your nodes.

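For instance, assuming your cold-tier nodes carry a custom `data: cold` node
attribute, a request like the following would restrict a mounted index's
shards to those nodes:

[source,console]
----
PUT /my-index/_settings
{
  "index.routing.allocation.require.data": "cold"
}
----
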
We recommend that you <<indices-forcemerge, force-merge>> indices to a single
segment per shard before taking a snapshot that will be mounted as a
{search-snap} index. Each read from a snapshot repository takes time and costs
money, so the fewer segments there are, the fewer reads are needed to restore
the snapshot.

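For example, the following request merges each shard of `my-index` (a
placeholder name) down to a single segment before you take the snapshot:

[source,console]
----
POST /my-index/_forcemerge?max_num_segments=1
----
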
[TIP]
====
{search-snaps-cap} are ideal for managing a large archive of historical data.
Historical information is typically searched less frequently than recent data,
and therefore may not need replicas for performance.

For more complex or time-consuming searches, you can use <<async-search>> with
{search-snaps}, as shown in the example that follows this tip.
====

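For example, the following sketch submits an async search against a mounted
index (the index name is a placeholder) and returns after at most two seconds
with a search ID you can poll for the full results:

[source,console]
----
POST /my-index/_async_search?wait_for_completion_timeout=2s
{
  "query": {
    "match_all": {}
  }
}
----
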
[discrete]
[[how-searchable-snapshots-work]]
=== How {search-snaps} work

When an index is mounted from a snapshot, {es} allocates its shards to data
nodes within the cluster. The data nodes then automatically restore the shard
data from the repository onto local storage. Once the restore process
completes, these shards respond to searches using the data held in local
storage and do not need to access the repository. This avoids incurring the
cost or performance penalty associated with reading data from the repository.

If a node holding one of these shards fails, {es} automatically allocates it to
another node, and that node restores the shard data from the repository. No
replicas are needed, and no complicated monitoring or orchestration is
necessary to restore lost shards.

{es} restores {search-snap} shards in the background and you can search them
even if they have not been fully restored. If a search hits a {search-snap}
shard before it has been fully restored, {es} eagerly retrieves the data needed
for the search. If a shard is freshly allocated to a node and still warming up,
some searches will be slower. However, searches typically access only a small
fraction of the total shard data, so the performance penalty is usually small.

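You can watch this background restore using the index recovery API, for
example (the index name is a placeholder):

[source,console]
----
GET /my-index/_recovery?active_only=true
----
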
Replicas of {search-snap} shards are restored by copying data from the
snapshot repository. In contrast, replicas of regular indices are restored by
copying data from the primary.