
[role="xpack"]
[testenv="basic"]
[[frozen-indices]]
= Frozen Indices

[partintro]
--
Elasticsearch indices can require a significant amount of memory to stay open and searchable. Yet not all indices need
to be writable at the same time, and access patterns change over time. For example, indices in time series or logging use cases
are unlikely to be queried once they age out, but they still need to be kept around to satisfy retention policies.

To keep such indices available and queryable for longer while reducing their hardware requirements, they can be transitioned
into a frozen state. Once an index is frozen, it is made read-only and all of its transient shard memory (aside from mappings
and analyzers) is moved to persistent storage. This allows for a much higher disk to heap storage ratio on individual nodes.
The dropped data structures must be reloaded on demand, and are dropped again afterwards, for each search request that targets
the frozen index. A search request that hits one or more frozen shards is executed on a throttled threadpool that ensures that
no more than `N` (`1` by default) searches run concurrently (see <<search-throttled>>). This protects nodes from exceeding the
available memory due to incoming search requests.

In contrast to ordinary open indices, frozen indices are expected to execute slowly and are not designed for high query load.
Parallelism is gained only on a per-node level, and loading data structures on demand is expected to be one or more orders of
magnitude slower than query execution on a per-shard level. Depending on the data in an index, a search against a frozen index
may take seconds to minutes, whereas the same search against the same index in an unfrozen state may complete in milliseconds.
--

== Best Practices

Since frozen indices provide a much higher disk to heap ratio at the expense of search latency, it is advisable to allocate frozen
indices to dedicated nodes so that searches on frozen indices do not interfere with traffic on low-latency nodes. Loading data
structures on demand carries significant overhead: it can cause page faults and garbage collections, which further slow down
query execution.
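
One way to pin an index to dedicated nodes is shard allocation filtering. The following is a minimal sketch, assuming the
dedicated nodes were started with a custom node attribute; the attribute name `box_type` and its value `frozen` are hypothetical
and not defined anywhere in this document.

[source,js]
--------------------------------------------------
PUT /twitter/_settings
{
  "index.routing.allocation.require.box_type": "frozen" <1>
}
--------------------------------------------------
// CONSOLE
// TEST[skip:box_type is a hypothetical node attribute]
<1> Assumes the dedicated nodes set `node.attr.box_type: frozen` in `elasticsearch.yml`. Any attribute name works as long as
the node configuration and the index setting agree.
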

Since indices that are eligible for freezing are unlikely to change in the future, their disk usage can be optimized as described
in <<tune-for-disk-usage>>.

It's highly recommended to <<indices-forcemerge,`_forcemerge`>> your indices prior to freezing to ensure that each shard has only
a single segment on disk. This not only provides much better compression but also simplifies the data structures needed to service
aggregation or sorted search requests.

[source,js]
--------------------------------------------------
POST /twitter/_forcemerge?max_num_segments=1
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
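
Once the force merge has completed, the index can be moved into the frozen state. The following is a minimal sketch, assuming
the freeze index API (`_freeze`) that accompanies this feature is available in your version; check the API reference before
relying on it.

[source,js]
--------------------------------------------------
POST /twitter/_freeze
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

Under the same assumption, `POST /twitter/_unfreeze` would return the index to its ordinary open state, for example when it
needs to be written to again.
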

== Searching a frozen index

Frozen indices are throttled in order to limit memory consumption per node. The number of concurrently loaded frozen indices per
node is limited by the number of threads in the <<search-throttled>> threadpool, which is `1` by default.
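
To see the throttling at work, the thread pool can be inspected with the cat thread pool API. The following is a minimal sketch,
assuming the pool is exposed under the name `search_throttled` and that the listed columns are available in your version.

[source,js]
--------------------------------------------------
GET /_cat/thread_pool/search_throttled?v&h=node_name,name,active,queue,rejected
--------------------------------------------------
// CONSOLE

With the default of one thread, `active` should not exceed `1` per node. A steadily growing `queue` suggests that searches on
frozen indices are arriving faster than they can be served.
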

Search requests are not executed against frozen indices by default, even if a frozen index is named explicitly. This prevents
accidental slowdowns caused by targeting a frozen index by mistake. To include frozen indices, a search request must be executed
with the query parameter `ignore_throttled=false`.

[source,js]
--------------------------------------------------
GET /twitter/_search?q=user:kimchy&ignore_throttled=false
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

[IMPORTANT]
================================
While frozen indices are slow to search, they can be pre-filtered efficiently. The request parameter `pre_filter_shard_size`
specifies a threshold on the number of shards a search request expands to. When the threshold is exceeded, a round-trip is
enforced to filter out search shards that cannot possibly match. This filter phase can limit the number of shards significantly.
For instance, if a date range filter is applied, then all indices (frozen or unfrozen) that do not contain documents within the
date range can be skipped efficiently. The default value for `pre_filter_shard_size` is `128`, but it's recommended to set it to
`1` when searching frozen indices. There is no significant overhead associated with this pre-filter phase.
================================
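
The pre-filter recommendation can be combined with `ignore_throttled=false` in a single request. The following is a minimal
sketch, assuming the `twitter` index has a date field named `date`; the field name and the range bounds are illustrative only.

[source,js]
--------------------------------------------------
GET /twitter/_search?ignore_throttled=false&pre_filter_shard_size=1
{
  "query": {
    "range": {
      "date": { <1>
        "gte": "2009-11-01",
        "lt": "2009-12-01"
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
<1> Assumed field name. Shards whose documents all fall outside this range can be skipped in the pre-filter phase without
loading their data structures.
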