[[important-settings]]
== Important Elasticsearch configuration

While Elasticsearch requires very little configuration, there are a number of
settings which must be configured manually before going into production:

* <<path-settings,`path.data` and `path.logs`>>
* <<cluster.name,`cluster.name`>>
* <<node.name,`node.name`>>
* <<bootstrap.mlockall,`bootstrap.mlockall`>>
* <<network.host,`network.host`>>
* <<unicast.hosts,`discovery.zen.ping.unicast.hosts`>>
* <<minimum_master_nodes,`discovery.zen.minimum_master_nodes`>>
* <<node.max_local_storage_nodes,`node.max_local_storage_nodes`>>
[float]
[[path-settings]]
=== `path.data` and `path.logs`

If you are using the `.zip` or `.tar.gz` archives, the `data` and `logs`
directories are sub-folders of `$ES_HOME`. If these important folders are
left in their default locations, there is a high risk of them being deleted
while upgrading Elasticsearch to a new version.

In production use, you will almost certainly want to change the locations of
the data and log folders:
[source,yaml]
--------------------------------------------------
path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch
--------------------------------------------------
The RPM and Debian distributions already use custom paths for `data` and
`logs`.

The `path.data` setting can be set to multiple paths, in which case all paths
will be used to store data (although the files belonging to a single shard
will all be stored on the same data path):
[source,yaml]
--------------------------------------------------
path:
  data:
    - /mnt/elasticsearch_1
    - /mnt/elasticsearch_2
    - /mnt/elasticsearch_3
--------------------------------------------------
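The same multi-path configuration can also be written on a single line as a
comma-separated list; a minimal sketch, reusing the hypothetical mount points
from above:

[source,yaml]
--------------------------------------------------
path.data: /mnt/elasticsearch_1,/mnt/elasticsearch_2,/mnt/elasticsearch_3
--------------------------------------------------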
[float]
[[cluster.name]]
=== `cluster.name`

A node can only join a cluster when it shares its `cluster.name` with all the
other nodes in the cluster. The default name is `elasticsearch`, but you
should change it to an appropriate name which describes the purpose of the
cluster:

[source,yaml]
--------------------------------------------------
cluster.name: logging-prod
--------------------------------------------------

Make sure that you don't reuse the same cluster names in different
environments, otherwise you might end up with nodes joining the wrong cluster.
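For example, you might use hypothetical names like `logging-dev`,
`logging-stage`, and `logging-prod` for the development, staging, and
production clusters respectively.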
[float]
[[node.name]]
=== `node.name`

By default, Elasticsearch will randomly pick a descriptive `node.name` from a
list of around 3000 Marvel characters when your node starts up, but this also
means that the `node.name` will change the next time the node restarts.

It is worth configuring a more meaningful name, which will also have the
advantage of persisting after the node restarts:

[source,yaml]
--------------------------------------------------
node.name: prod-data-2
--------------------------------------------------

The `node.name` can also be set to the server's HOSTNAME as follows:

[source,yaml]
--------------------------------------------------
node.name: ${HOSTNAME}
--------------------------------------------------
[float]
[[bootstrap.mlockall]]
=== `bootstrap.mlockall`

It is vitally important to the health of your node that none of the JVM is
ever swapped out to disk. One way of achieving that is to set the
`bootstrap.mlockall` setting to `true`.

For this setting to take effect, other system settings need to be configured
first. See <<mlockall>> for more details about how to set up memory locking
correctly.
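A minimal sketch of the setting in `elasticsearch.yml`:

[source,yaml]
--------------------------------------------------
bootstrap.mlockall: true
--------------------------------------------------

After starting the node, you can check whether memory locking succeeded by
looking at the `mlockall` value in the nodes info API; this example assumes a
node listening on `localhost:9200`:

[source,sh]
--------------------------------------------------
curl 'localhost:9200/_nodes?filter_path=**.mlockall'
--------------------------------------------------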
[float]
[[network.host]]
=== `network.host`

By default, Elasticsearch binds to loopback addresses only -- e.g. `127.0.0.1`
and `[::1]`. This is sufficient to run a single development node on a server.

TIP: In fact, more than one node can be started from the same `$ES_HOME` location
on a single server. This can be useful for testing Elasticsearch's ability to
form clusters, but it is not a configuration recommended for production.

In order to communicate and to form a cluster with nodes on other servers,
your node will need to bind to a non-loopback address. While there are many
<<modules-network,network settings>>, usually all you need to configure is
`network.host`:
[source,yaml]
--------------------------------------------------
network.host: 192.168.1.10
--------------------------------------------------

The `network.host` setting also understands some special values such as
`_local_`, `_site_`, `_global_` and modifiers like `:ip4` and `:ip6`, details
of which can be found in <<network-interface-values>>.
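For example, to bind to any available site-local address (such as an address
in `192.168.0.0/16`), a minimal sketch:

[source,yaml]
--------------------------------------------------
network.host: _site_
--------------------------------------------------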
IMPORTANT: As soon as you provide a custom setting for `network.host`,
Elasticsearch assumes that you are moving from development mode to production
mode, and upgrades a number of system startup checks from warnings to
exceptions. See <<dev-vs-prod>> for more information.
[float]
[[unicast.hosts]]
=== `discovery.zen.ping.unicast.hosts`

Out of the box, without any network configuration, Elasticsearch will bind to
the available loopback addresses and will scan ports 9300 to 9305 to try to
connect to other nodes running on the same server. This provides an
auto-clustering experience without having to do any configuration.

When the moment comes to form a cluster with nodes on other servers, you have
to provide a seed list of other nodes in the cluster that are likely to be
live and contactable. This can be specified as follows:
[source,yaml]
--------------------------------------------------
discovery.zen.ping.unicast.hosts:
   - 192.168.1.10:9300
   - 192.168.1.11 <1>
   - seeds.mydomain.com <2>
--------------------------------------------------
<1> The port will default to 9300 if not specified.
<2> A hostname that resolves to multiple IP addresses will try all resolved addresses.
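If a node has been configured with a non-default transport port via
`transport.tcp.port`, that port must be included in its seed entry. A minimal
sketch, assuming a hypothetical node at `192.168.1.12` listening on port
`9400`:

[source,yaml]
--------------------------------------------------
discovery.zen.ping.unicast.hosts:
   - 192.168.1.12:9400
--------------------------------------------------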
[float]
[[minimum_master_nodes]]
=== `discovery.zen.minimum_master_nodes`

To prevent data loss, it is vital to configure the
`discovery.zen.minimum_master_nodes` setting so that each master-eligible node
knows the _minimum number of master-eligible nodes_ that must be visible in
order to form a cluster.

Without this setting, a cluster that suffers a network failure is at risk of
having the cluster split into two independent clusters -- a split brain --
which will lead to data loss. A more detailed explanation is provided
in <<split-brain>>.

To avoid a split brain, this setting should be set to a _quorum_ of
master-eligible nodes:

    (master_eligible_nodes / 2) + 1

In other words, if there are three master-eligible nodes, then minimum master
nodes should be set to `(3 / 2) + 1` or `2`:
[source,yaml]
--------------------------------------------------
discovery.zen.minimum_master_nodes: 2
--------------------------------------------------

IMPORTANT: If `discovery.zen.minimum_master_nodes` is not set when
Elasticsearch is running in <<dev-vs-prod,production mode>>, an exception will
be thrown which will prevent the node from starting.
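This setting can also be updated dynamically on a running cluster via the
cluster update settings API, which is useful when the number of
master-eligible nodes changes. A minimal sketch using curl, assuming a node
reachable on `localhost:9200`:

[source,sh]
--------------------------------------------------
curl -XPUT 'localhost:9200/_cluster/settings' \
     -H 'Content-Type: application/json' \
     -d '{
  "transient": {
    "discovery.zen.minimum_master_nodes": 2
  }
}'
--------------------------------------------------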
[float]
[[node.max_local_storage_nodes]]
=== `node.max_local_storage_nodes`

It is possible to start more than one node on the same server from the same
`$ES_HOME`, just by doing the following:

[source,sh]
--------------------------------------------------
./bin/elasticsearch -d
./bin/elasticsearch -d
--------------------------------------------------
This works just fine: the data directory structure is designed to let multiple
nodes coexist. However, a single instance of Elasticsearch is able to use all
of the resources of a server, and it seldom makes sense to run multiple nodes
on the same server in production.

It is, however, possible to start more than one node on the same server by
mistake and to be completely unaware that this problem exists. To prevent more
than one node from sharing the same data directory, it is advisable to add the
following setting:

[source,yaml]
--------------------------------------------------
node.max_local_storage_nodes: 1
--------------------------------------------------