[[searchable-snapshots]]
== {search-snaps-cap}

{search-snaps-cap} let you use <<snapshot-restore,snapshots>> to search
infrequently accessed and read-only data in a very cost-effective fashion. The
<<cold-tier,cold>> and <<frozen-tier,frozen>> data tiers use {search-snaps} to
reduce your storage and operating costs.

{search-snaps-cap} eliminate the need for <<scalability,replica shards>>,
potentially halving the local storage needed to search your data.
{search-snaps-cap} rely on the same snapshot mechanism you already use for
backups and have minimal impact on your snapshot repository storage costs.

[discrete]
[[using-searchable-snapshots]]
=== Using {search-snaps}

Searching a {search-snap} index is the same as searching any other index.

By default, {search-snap} indices have no replicas. The underlying snapshot
provides resilience and the query volume is expected to be low enough that a
single shard copy will be sufficient. However, if you need to support a higher
query volume, you can add replicas by adjusting the `index.number_of_replicas`
index setting.
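
For example, here is a minimal sketch of adding a replica to a mounted index
with the update index settings API. The index name `my-mounted-index` is a
placeholder:

[source,console]
----
PUT /my-mounted-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
----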

If a node fails and {search-snap} shards need to be recovered elsewhere, there
is a brief window of time while {es} allocates the shards to other nodes where
the cluster health will not be `green`. Searches that hit these shards may fail
or return partial results until the shards are reallocated to healthy nodes.

You typically manage {search-snaps} through {ilm-init}. The
<<ilm-searchable-snapshot, searchable snapshots>> action automatically converts
a regular index into a {search-snap} index when it reaches the `cold` or
`frozen` phase. You can also make indices in existing snapshots searchable by
manually mounting them using the <<searchable-snapshots-api-mount-snapshot,
mount snapshot>> API.
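
As an illustration, here is a sketch of an {ilm-init} policy that converts an
index into a {search-snap} index when it enters the `frozen` phase. The policy
name, repository name, and `min_age` value are all placeholders:

[source,console]
----
PUT _ilm/policy/my-policy
{
  "policy": {
    "phases": {
      "frozen": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "my-repository"
          }
        }
      }
    }
  }
}
----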

To mount an index from a snapshot that contains multiple indices, we recommend
creating a <<clone-snapshot-api, clone>> of the snapshot that contains only the
index you want to search, and mounting the clone. You should not delete a
snapshot if it has any mounted indices, so creating a clone enables you to
manage the lifecycle of the backup snapshot independently of any
{search-snaps}. If you use {ilm-init} to manage your {search-snaps} then it
will automatically take care of cloning the snapshot as needed.
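
As a sketch, you might clone a single index out of a multi-index snapshot and
then mount the clone. All of the repository, snapshot, and index names here are
hypothetical:

[source,console]
----
PUT /_snapshot/my-repository/my-snapshot/_clone/my-snapshot-clone
{
  "indices": "my-index"
}

POST /_snapshot/my-repository/my-snapshot-clone/_mount?wait_for_completion=true
{
  "index": "my-index"
}
----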

You can control the allocation of the shards of {search-snap} indices using the
same mechanisms as for regular indices. For example, you could use
<<shard-allocation-filtering>> to restrict {search-snap} shards to a subset of
your nodes.
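
For instance, a sketch that pins a mounted index to a single node using the
built-in `_name` allocation attribute. The index and node names are
placeholders:

[source,console]
----
PUT /my-mounted-index/_settings
{
  "index.routing.allocation.require._name": "node-1"
}
----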

The speed of recovery of a {search-snap} index is limited by the repository
setting `max_restore_bytes_per_sec` and the node setting
`indices.recovery.max_bytes_per_sec` just like a normal restore operation. By
default `max_restore_bytes_per_sec` is unlimited, but the default for
`indices.recovery.max_bytes_per_sec` depends on the configuration of the node.
See <<recovery-settings>>.
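
For example, a sketch of raising the recovery rate limit dynamically through
the cluster settings API. The value shown is purely illustrative:

[source,console]
----
PUT /_cluster/settings
{
  "persistent": {
    "indices.recovery.max_bytes_per_sec": "100mb"
  }
}
----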

We recommend that you <<indices-forcemerge, force-merge>> indices to a single
segment per shard before taking a snapshot that will be mounted as a
{search-snap} index. Each read from a snapshot repository takes time and costs
money, and the fewer segments there are the fewer reads are needed to restore
the snapshot or to respond to a search.
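
For example, a sketch that merges a hypothetical index down to one segment per
shard before it is snapshotted:

[source,console]
----
POST /my-index/_forcemerge?max_num_segments=1
----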

[TIP]
====
{search-snaps-cap} are ideal for managing a large archive of historical data.
Historical information is typically searched less frequently than recent data
and therefore may not need replicas for performance reasons.

For more complex or time-consuming searches, you can use <<async-search>> with
{search-snaps}, as sketched after this tip.
====
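
For example, a sketch of submitting an asynchronous search against a mounted
index. The index name, field name, and timeout are placeholders:

[source,console]
----
POST /my-mounted-index/_async_search?wait_for_completion_timeout=2s
{
  "query": {
    "match": {
      "message": "error"
    }
  }
}
----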

[[searchable-snapshots-repository-types]]
You can use any of the following repository types with searchable snapshots:

* {plugins}/repository-s3.html[AWS S3]
* {plugins}/repository-gcs.html[Google Cloud Storage]
* {plugins}/repository-azure.html[Azure Blob Storage]
* {plugins}/repository-hdfs.html[Hadoop Distributed File System (HDFS)]
* <<snapshots-filesystem-repository,Shared filesystems>> such as NFS

You can also use alternative implementations of these repository types, for
instance
{plugins}/repository-s3-client.html#repository-s3-compatible-services[Minio],
as long as they are fully compatible. You can use the <<repo-analysis-api>> API
to analyze your repository's suitability for use with searchable snapshots.
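
For example, a sketch of running the repository analysis API against a
hypothetical repository named `my-repository`, with deliberately small,
illustrative parameters:

[source,console]
----
POST /_snapshot/my-repository/_analyze?blob_count=10&max_blob_size=1mb
----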

[discrete]
[[how-searchable-snapshots-work]]
=== How {search-snaps} work

When an index is mounted from a snapshot, {es} allocates its shards to data
nodes within the cluster. The data nodes then automatically retrieve the
relevant shard data from the repository onto local storage, based on the
<<searchable-snapshot-mount-storage-options,mount options>> specified. If
possible, searches use data from local storage. If the data is not available
locally, {es} downloads the data that it needs from the snapshot repository.

If a node holding one of these shards fails, {es} automatically allocates the
affected shards on another node, and that node restores the relevant shard data
from the repository. No replicas are needed, and no complicated monitoring or
orchestration is necessary to restore lost shards. Although searchable snapshot
indices have no replicas by default, you may add replicas to these indices by
adjusting `index.number_of_replicas`. Replicas of {search-snap} shards are
recovered by copying data from the snapshot repository, just like primaries of
{search-snap} shards. In contrast, replicas of regular indices are restored by
copying data from the primary.

[discrete]
[[searchable-snapshot-mount-storage-options]]
==== Mount options

To search a snapshot, you must first mount it locally as an index. Usually
{ilm-init} will do this automatically, but you can also call the
<<searchable-snapshots-api-mount-snapshot,mount snapshot>> API yourself. There
are two options for mounting a snapshot, each with different performance
characteristics and local storage footprints:

[[full-copy]]
Full copy::
Loads a full copy of the snapshotted index's shards onto node-local storage
within the cluster. This is the default mount option. {ilm-init} uses this
option by default in the `hot` and `cold` phases.
+
Search performance for a full-copy searchable snapshot index is normally
comparable to a regular index, since there is minimal need to access the
snapshot repository. While recovery is ongoing, search performance may be
slower than with a regular index because a search may need some data that has
not yet been retrieved into the local copy. If that happens, {es} will eagerly
retrieve the data needed to complete the search in parallel with the ongoing
recovery.

[[shared-cache]]
Shared cache::
+
experimental::[]
+
Uses a local cache containing only recently searched parts of the snapshotted
index's data. {ilm-init} uses this option by default in the `frozen` phase and
corresponding frozen tier.
+
If a search requires data that is not in the cache, {es} fetches the missing
data from the snapshot repository. Searches that require these fetches are
slower, but the fetched data is stored in the cache so that similar searches
can be served more quickly in the future. {es} will evict infrequently used
data from the cache to free up space.
+
Although slower than a full local copy or a regular index, a shared-cache
searchable snapshot index still returns search results quickly, even for large
data sets, because the layout of data in the repository is heavily optimized
for search. Many searches will need to retrieve only a small subset of the
total shard data before returning results.

To mount a searchable snapshot index with the shared cache mount option, you
must configure the `xpack.searchable.snapshot.shared_cache.size` setting to
reserve space for the cache on one or more nodes. Indices mounted with the
shared cache mount option are only allocated to nodes that have this setting
configured.

[[searchable-snapshots-shared-cache]]
`xpack.searchable.snapshot.shared_cache.size`::
(<<static-cluster-setting,Static>>, <<byte-units,byte value>>)
The size of the space reserved for the shared cache. Defaults to `0b`, meaning
that the node has no shared cache.

You can configure the setting in `elasticsearch.yml`:

[source,yaml]
----
xpack.searchable.snapshot.shared_cache.size: 4TB
----

IMPORTANT: Currently, you can configure
`xpack.searchable.snapshot.shared_cache.size` on any node. In a future release,
you will only be able to configure this setting on nodes with the
<<data-frozen-node,`data_frozen`>> role.

You can set `xpack.searchable.snapshot.shared_cache.size` to any size between a
couple of gigabytes and 90% of available disk space. We only recommend larger
sizes if you use the node exclusively for the frozen tier or for searchable
snapshots.
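
For example, a sketch of mounting an index with the shared cache option by
passing the `storage` query parameter to the mount snapshot API. The
repository, snapshot, and index names are placeholders:

[source,console]
----
POST /_snapshot/my-repository/my-snapshot/_mount?storage=shared_cache
{
  "index": "my-index"
}
----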

[discrete]
[[back-up-restore-searchable-snapshots]]
=== Back up and restore {search-snaps}

You can use <<snapshot-lifecycle-management,regular snapshots>> to back up a
cluster containing {search-snap} indices. When you restore a snapshot
containing {search-snap} indices, these indices are restored as {search-snap}
indices again.

Before you restore a snapshot containing a {search-snap} index, you must first
<<snapshots-register-repository,register the repository>> containing the
original index snapshot. When restored, the {search-snap} index mounts the
original index snapshot from its original repository. If desired, you can use
separate repositories for regular snapshots and {search-snaps}.
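
For example, a sketch of registering the repository that holds the original
index snapshot before running the restore. The repository name, type, and
bucket are all hypothetical:

[source,console]
----
PUT /_snapshot/my-repository
{
  "type": "s3",
  "settings": {
    "bucket": "my-bucket"
  }
}
----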

A snapshot of a {search-snap} index contains only a small amount of metadata
which identifies its original index snapshot. It does not contain any data from
the original index. The restore of a backup will fail to restore any
{search-snap} indices whose original index snapshot is unavailable.

[discrete]
[[searchable-snapshots-reliability]]
=== Reliability of {search-snaps}

The sole copy of the data in a {search-snap} index is the underlying snapshot,
stored in the repository. If the repository fails or corrupts the contents of
the snapshot then the data is lost. Although {es} may have made copies of the
data onto local storage, these copies may be incomplete and cannot be used to
recover any data after a repository failure. You must make sure that your
repository is reliable and protects against corruption of your data while it is
at rest in the repository.

The blob storage offered by all major public cloud providers typically offers
very good protection against data loss or corruption. If you manage your own
repository storage then you are responsible for its reliability.

[discrete]
[[searchable-snapshots-frozen-tier-on-cloud]]
=== Configure a frozen tier on {ess}

The frozen data tier is not yet available on the {ess-trial}[{ess}]. However,
you can configure another tier to use <<shared-cache,shared snapshot caches>>.
This effectively recreates a frozen tier in your {ess} deployment. Follow these
steps:

. Choose an existing tier to use. Typically, you'll use the cold tier, but the
hot and warm tiers are also supported. You can use this tier as a shared tier,
or you can dedicate the tier exclusively to shared snapshot caches.

. Log in to the {ess-trial}[{ess} Console].

. Select your deployment from the {ess} home page or the deployments page.

. From your deployment menu, select **Edit deployment**.

. On the **Edit** page, click **Edit elasticsearch.yml** under your selected
{es} tier.

. In the `elasticsearch.yml` file, add the
<<searchable-snapshots-shared-cache,`xpack.searchable.snapshot.shared_cache.size`>>
setting. For example:
+
[source,yaml]
----
xpack.searchable.snapshot.shared_cache.size: 50GB
----

. Click **Save** and **Confirm** to apply your configuration changes.