123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146 |
- [[index-modules-store]]
- == Store
- The store module allows you to control how index data is stored and accessed on disk.
- NOTE: This is a low-level setting. Some store implementations have poor
- concurrency or disable optimizations for heap memory usage. We recommend
- sticking to the defaults.
- [float]
- [[file-system]]
- === File system storage types
- There are different file system implementations or _storage types_. By default,
- Elasticsearch will pick the best implementation based on the operating
- environment.
- The storage type can also be explicitly set for all indices by configuring the
- store type in the `config/elasticsearch.yml` file:
- [source,yaml]
- ---------------------------------
- index.store.type: hybridfs
- ---------------------------------
- It is a _static_ setting that can be set on a per-index basis at index
- creation time:
- [source,console]
- ---------------------------------
- PUT /my_index
- {
- "settings": {
- "index.store.type": "hybridfs"
- }
- }
- ---------------------------------
- WARNING: This is an expert-only setting and may be removed in the future.
- The following sections lists all the different storage types supported.
- `fs`::
- Default file system implementation. This will pick the best implementation
- depending on the operating environment, which is currently `hybridfs` on all
- supported systems but is subject to change.
- [[simplefs]]`simplefs`::
- The Simple FS type is a straightforward implementation of file system
- storage (maps to Lucene `SimpleFsDirectory`) using a random access file.
- This implementation has poor concurrent performance (multiple threads
- will bottleneck) and disables some optimizations for heap memory usage.
- [[niofs]]`niofs`::
- The NIO FS type stores the shard index on the file system (maps to
- Lucene `NIOFSDirectory`) using NIO. It allows multiple threads to read
- from the same file concurrently. It is not recommended on Windows
- because of a bug in the SUN Java implementation and disables some
- optimizations for heap memory usage.
- [[mmapfs]]`mmapfs`::
- The MMap FS type stores the shard index on the file system (maps to
- Lucene `MMapDirectory`) by mapping a file into memory (mmap). Memory
- mapping uses up a portion of the virtual memory address space in your
- process equal to the size of the file being mapped. Before using this
- class, be sure you have allowed plenty of
- <<vm-max-map-count,virtual address space>>.
- [[hybridfs]]`hybridfs`::
- The `hybridfs` type is a hybrid of `niofs` and `mmapfs`, which chooses the best
- file system type for each type of file based on the read access pattern.
- Currently only the Lucene term dictionary, norms and doc values files are
- memory mapped. All other files are opened using Lucene `NIOFSDirectory`.
- Similarly to `mmapfs` be sure you have allowed plenty of
- <<vm-max-map-count,virtual address space>>.
- [[allow-mmap]]
- You can restrict the use of the `mmapfs` and the related `hybridfs` store type
- via the setting `node.store.allow_mmap`. This is a boolean setting indicating
- whether or not memory-mapping is allowed. The default is to allow it. This
- setting is useful, for example, if you are in an environment where you can not
- control the ability to create a lot of memory maps so you need disable the
- ability to use memory-mapping.
- [[preload-data-to-file-system-cache]]
- === Preloading data into the file system cache
- NOTE: This is an expert setting, the details of which may change in the future.
- By default, Elasticsearch completely relies on the operating system file system
- cache for caching I/O operations. It is possible to set `index.store.preload`
- in order to tell the operating system to load the content of hot index
- files into memory upon opening. This setting accept a comma-separated list of
- files extensions: all files whose extension is in the list will be pre-loaded
- upon opening. This can be useful to improve search performance of an index,
- especially when the host operating system is restarted, since this causes the
- file system cache to be trashed. However note that this may slow down the
- opening of indices, as they will only become available after data have been
- loaded into physical memory.
- This setting is best-effort only and may not work at all depending on the store
- type and host operating system.
- The `index.store.preload` is a static setting that can either be set in the
- `config/elasticsearch.yml`:
- [source,yaml]
- ---------------------------------
- index.store.preload: ["nvd", "dvd"]
- ---------------------------------
- or in the index settings at index creation time:
- [source,console]
- ---------------------------------
- PUT /my_index
- {
- "settings": {
- "index.store.preload": ["nvd", "dvd"]
- }
- }
- ---------------------------------
- The default value is the empty array, which means that nothing will be loaded
- into the file-system cache eagerly. For indices that are actively searched,
- you might want to set it to `["nvd", "dvd"]`, which will cause norms and doc
- values to be loaded eagerly into physical memory. These are the two first
- extensions to look at since Elasticsearch performs random access on them.
- A wildcard can be used in order to indicate that all files should be preloaded:
- `index.store.preload: ["*"]`. Note however that it is generally not useful to
- load all files into memory, in particular those for stored fields and term
- vectors, so a better option might be to set it to
- `["nvd", "dvd", "tim", "doc", "dim"]`, which will preload norms, doc values,
- terms dictionaries, postings lists and points, which are the most important
- parts of the index for search and aggregations.
- Note that this setting can be dangerous on indices that are larger than the size
- of the main memory of the host, as it would cause the filesystem cache to be
- trashed upon reopens after large merges, which would make indexing and searching
- _slower_.
|