[[path-settings]]
[discrete]
==== Path settings

{es} writes the data you index to indices and data streams to a `data`
directory. {es} writes its own application logs, which contain information about
cluster health and operations, to a `logs` directory.

For <<targz,macOS `.tar.gz`>>, <<targz,Linux `.tar.gz`>>, and
<<zip-windows,Windows `.zip`>> installations, `data` and `logs` are
subdirectories of `$ES_HOME` by default. However, files in `$ES_HOME` risk
deletion during an upgrade.

In production, we strongly recommend you set `path.data` and `path.logs` in
`elasticsearch.yml` to locations outside of `$ES_HOME`. <<docker,Docker>>,
<<deb,Debian>>, and <<rpm,RPM>> installations write
data and logs to locations outside of `$ES_HOME` by default.
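
For example, on a Linux installation you might point both directories at
dedicated locations like the following. The paths here are illustrative;
substitute your own:

[source,yaml]
--------------------------------------------------
path:
  data: /var/data/elasticsearch
  logs: /var/log/elasticsearch
--------------------------------------------------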
Supported `path.data` and `path.logs` values vary by platform:

include::{es-repo-dir}/tab-widgets/customize-data-log-path-widget.asciidoc[]

include::{es-repo-dir}/modules/node.asciidoc[tag=modules-node-data-path-warning-tag]
[discrete]
==== Multiple data paths

deprecated::[7.13.0]

If needed, you can specify multiple paths in `path.data`. {es} stores the node's
data across all provided paths but keeps each shard's data on the same path.

{es} does not balance shards across a node's data paths. High disk
usage in a single path can trigger a <<disk-based-shard-allocation,high disk
usage watermark>> for the entire node. If triggered, {es} will not add shards to
the node, even if the node's other paths have available disk space. If you need
additional disk space, we recommend you add a new node rather than additional
data paths.

include::{es-repo-dir}/tab-widgets/multi-data-path-widget.asciidoc[]
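
As a minimal sketch, a two-path configuration in `elasticsearch.yml` might look
like this (the mount points are illustrative):

[source,yaml]
--------------------------------------------------
path:
  data:
    - /mnt/elasticsearch_1
    - /mnt/elasticsearch_2
--------------------------------------------------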
[discrete]
[[mdp-migrate]]
==== Migrate from multiple data paths

Support for multiple data paths was deprecated in 7.13 and will be removed
in a future release.

As an alternative to multiple data paths, you can create a filesystem which
spans multiple disks with a hardware virtualisation layer such as RAID, or a
software virtualisation layer such as Logical Volume Manager (LVM) on Linux or
Storage Spaces on Windows. If you wish to use multiple data paths on a single
machine then you must run one node for each data path.

If you currently use multiple data paths in a
{ref}/high-availability-cluster-design.html[highly available cluster] then you
can migrate to a setup that uses a single path for each node without downtime
using a process similar to a
{ref}/restart-cluster.html#restart-cluster-rolling[rolling restart]: shut each
node down in turn and replace it with one or more nodes each configured to use
a single data path. In more detail, for each node that currently has multiple
data paths you should follow this process. In principle you can
perform this migration during a rolling upgrade to 8.0, but we recommend
migrating to a single-data-path setup before starting to upgrade.
1. Take a snapshot to protect your data in case of disaster.

2. Optionally, migrate the data away from the target node by using an
{ref}/modules-cluster.html#cluster-shard-allocation-filtering[allocation filter]:
+
[source,console]
--------------------------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._name": "target-node-name"
  }
}
--------------------------------------------------
+
You can use the {ref}/cat-allocation.html[cat allocation API] to track progress
of this data migration. If some shards do not migrate then the
{ref}/cluster-allocation-explain.html[cluster allocation explain API] will help
you to determine why.
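+
For example, you can poll the cat allocation API and watch the shard count on
the target node drop to zero:
+
[source,console]
--------------------------------------------------
GET _cat/allocation?v=true
--------------------------------------------------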
3. Follow the steps in the
{ref}/restart-cluster.html#restart-cluster-rolling[rolling restart process]
up to and including shutting the target node down.

4. Ensure your cluster health is `yellow` or `green`, so that there is a copy
of every shard assigned to at least one of the other nodes in your cluster.
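+
For instance, the following request waits until the cluster health reaches at
least `yellow`:
+
[source,console]
--------------------------------------------------
GET _cluster/health?wait_for_status=yellow&timeout=30s
--------------------------------------------------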
5. If applicable, remove the allocation filter applied in the earlier step.
+
[source,console]
--------------------------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._name": null
  }
}
--------------------------------------------------
6. Discard the data held by the stopped node by deleting the contents of its
data paths.

7. Reconfigure your storage. For instance, combine your disks into a single
filesystem using LVM or Storage Spaces. Ensure that your reconfigured storage
has sufficient space for the data that it will hold.

8. Reconfigure your node by adjusting the `path.data` setting in its
`elasticsearch.yml` file. If needed, install more nodes each with their own
`path.data` setting pointing at a separate data path.
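+
For example, after combining the disks into a single filesystem, the node's
`elasticsearch.yml` might contain a single entry (the mount point here is a
placeholder):
+
[source,yaml]
--------------------------------------------------
path.data: /mnt/combined_volume
--------------------------------------------------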
9. Start the new nodes and follow the rest of the
{ref}/restart-cluster.html#restart-cluster-rolling[rolling restart process] for
them.

10. Ensure your cluster health is `green`, so that every shard has been
assigned.
You can alternatively add some number of single-data-path nodes to your
cluster, migrate all your data over to these new nodes using
{ref}/modules-cluster.html#cluster-shard-allocation-filtering[allocation filters],
and then remove the old nodes from the cluster. This approach will temporarily
double the size of your cluster so it will only work if you have the capacity to
expand your cluster like this.
If you currently use multiple data paths but your cluster is not highly
available then you can migrate to a non-deprecated configuration by taking
a snapshot, creating a new cluster with the desired configuration and restoring
the snapshot into it.
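
As a rough sketch, assuming you have already registered a snapshot repository
named `my_repository` (the repository and snapshot names here are placeholders),
you could take the snapshot with:

[source,console]
--------------------------------------------------
PUT _snapshot/my_repository/my_snapshot?wait_for_completion=true
--------------------------------------------------

Then, once the new cluster is running and the same repository is registered
there, restore it with:

[source,console]
--------------------------------------------------
POST _snapshot/my_repository/my_snapshot/_restore
--------------------------------------------------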