- [[setup-configuration]]
- == Configuration
- [float]
- === Environment Variables
- The startup scripts pass a built-in set of `JAVA_OPTS` to the JVM when they
- start it. The most important settings are `-Xmx`, which controls the maximum
- memory the process may use, and `-Xms`, which controls the minimum memory
- allocated to the process (_in general, the more memory allocated to the
- process, the better_).
- In most cases it is better to leave the default `JAVA_OPTS` as they are,
- and use the `ES_JAVA_OPTS` environment variable to set or change
- JVM settings or arguments.
- The `ES_HEAP_SIZE` environment variable sets the heap memory
- that will be allocated to the Elasticsearch Java process. It assigns
- the same value to both the minimum and the maximum, though those can be set
- explicitly (not recommended) by setting `ES_MIN_MEM` (defaults to
- `256m`), and `ES_MAX_MEM` (defaults to `1g`).
- It is recommended to set the min and max memory to the same value, and
- enable <<setup-configuration-memory,`mlockall`>>.
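As a minimal sketch of the variables described above, the environment could be prepared like this before starting a node. The heap size `4g` and the `-verbose:gc` flag are illustrative assumptions, not recommendations from this page.

```shell
# Assumption: 4g is an illustrative heap size; pick one that suits your machine.
# ES_HEAP_SIZE sets both the minimum and the maximum heap to the same value.
export ES_HEAP_SIZE=4g

# Extra JVM arguments go in ES_JAVA_OPTS, leaving the default JAVA_OPTS untouched.
# Assumption: -verbose:gc is only an example of a standard JVM flag.
export ES_JAVA_OPTS="-verbose:gc"

echo "heap: $ES_HEAP_SIZE, extra JVM options: $ES_JAVA_OPTS"
# Then start the node from the same shell, for example: ./bin/elasticsearch
```

Both variables must be exported in the shell (or service environment) that launches Elasticsearch, otherwise the startup script will not see them.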
- [float]
- [[system]]
- === System Configuration
- [float]
- [[file-descriptors]]
- ==== File Descriptors
- Make sure to increase the number of open file descriptors on the
- machine (or for the user running Elasticsearch). Setting it to 32k or
- even 64k is recommended.
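The limit that applies to the current shell can be inspected with `ulimit -n`. The following is a rough sketch, assuming a target of 65536 ("64k"); the variable names are made up for illustration.

```shell
# Compare the soft limit on open files for this shell with a target value.
# Assumption: 65536 ("64k") is the target; 32768 would also match the advice above.
target=65536
current=$(ulimit -n)

if [ "$current" = "unlimited" ] || [ "$current" -ge "$target" ]; then
    fd_status="ok"
else
    fd_status="too low"
fi
echo "open files limit: $current ($fd_status)"

# To raise the soft limit for this shell only (fails if the hard limit is lower):
#   ulimit -n 65536
```

A limit raised with `ulimit -n` only affects the current shell and the processes started from it, so it has to be set in the session or service definition that launches Elasticsearch.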
- In order to test how many open files the process can open, start it with
- `-Des.max-open-files` set to `true`. This will print the number of open
- files the process can open on startup.
- Alternatively, you can retrieve the `max_file_descriptors` for each node
- using the <<cluster-nodes-info>> API, with:
- [source,js]
- --------------------------------------------------
- curl localhost:9200/_nodes/process?pretty
- --------------------------------------------------
- [float]
- [[vm-max-map-count]]
- ==== Virtual memory
- Elasticsearch uses a <<default_fs,`hybrid mmapfs / niofs`>> directory by default to store its indices. The default
- operating system limits on mmap counts are likely to be too low, which may
- result in out of memory exceptions. On Linux, you can increase the limits by
- running the following command as `root`:
- [source,sh]
- -------------------------------------
- sysctl -w vm.max_map_count=262144
- -------------------------------------
- To set this value permanently, update the `vm.max_map_count` setting in
- `/etc/sysctl.conf`.
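As a sketch of the permanent form, the entry in `/etc/sysctl.conf` is a single `key=value` line. The snippet below only prints the current value and the line to add; the variable name `sysctl_line` is an assumption used for illustration.

```shell
# Print the current limit (Linux only; the file is absent on other systems).
if [ -r /proc/sys/vm/max_map_count ]; then
    echo "current vm.max_map_count: $(cat /proc/sys/vm/max_map_count)"
fi

# The persistent setting is one line in /etc/sysctl.conf:
sysctl_line="vm.max_map_count=262144"
echo "$sysctl_line"

# As root, append the line and reload the file:
#   echo "vm.max_map_count=262144" >> /etc/sysctl.conf
#   sysctl -p
```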
- NOTE: If you installed Elasticsearch using a package (.deb, .rpm) this setting will be changed automatically. To verify, run `sysctl vm.max_map_count`.
- [float]
- [[setup-configuration-memory]]
- ==== Memory Settings
- Most operating systems try to use as much memory as possible for file system
- caches and eagerly swap out unused application memory, possibly resulting
- in the Elasticsearch process being swapped out. Swapping is very bad for
- performance and for node stability, so it should be avoided at all costs.
- There are three options:
- * **Disable swap**
- +
- --
- The simplest option is to completely disable swap. Usually Elasticsearch
- is the only service running on a box, and its memory usage is controlled
- by the `ES_HEAP_SIZE` environment variable. There should be no need
- to have swap enabled.
- On Linux systems, you can disable swap temporarily
- by running: `sudo swapoff -a`. To disable it permanently, you will need
- to edit the `/etc/fstab` file and comment out any lines that contain the
- word `swap`.
- On Windows, the equivalent can be achieved by disabling the paging file entirely
- via `System Properties → Advanced → Performance → Advanced → Virtual memory`.
- --
- * **Configure `swappiness`**
- +
- --
- The second option is to ensure that the sysctl value `vm.swappiness` is set
- to `0`. This reduces the kernel's tendency to swap and should not lead to
- swapping under normal circumstances, while still allowing the whole system
- to swap in emergency conditions.
- NOTE: From kernel version 3.5-rc1 and above, a `swappiness` of `0` will
- cause the OOM killer to kill the process instead of allowing swapping.
- You will need to set `swappiness` to `1` to still allow swapping in
- emergencies.
- --
- * **`mlockall`**
- +
- --
- The third option is to use
- http://opengroup.org/onlinepubs/007908799/xsh/mlockall.html[mlockall] on Linux/Unix systems, or https://msdn.microsoft.com/en-us/library/windows/desktop/aa366895%28v=vs.85%29.aspx[VirtualLock] on Windows, to
- try to lock the process address space into RAM, preventing any Elasticsearch
- memory from being swapped out. This can be done by adding this line
- to the `config/elasticsearch.yml` file:
- [source,yaml]
- --------------
- bootstrap.mlockall: true
- --------------
- After starting Elasticsearch, you can see whether this setting was applied
- successfully by checking the value of `mlockall` in the output from this
- request:
- [source,sh]
- --------------
- curl http://localhost:9200/_nodes/process?pretty
- --------------
- If you see that `mlockall` is `false`, then it means that the `mlockall`
- request has failed. The most probable reason, on Linux/Unix systems, is that
- the user running Elasticsearch doesn't have permission to lock memory. This can
- be granted by running `ulimit -l unlimited` as `root` before starting Elasticsearch.
- Another possible reason why `mlockall` can fail is that the temporary directory
- (usually `/tmp`) is mounted with the `noexec` option. This can be solved by
- specifying a new temp directory, by starting Elasticsearch with:
- [source,sh]
- --------------
- ./bin/elasticsearch -Djna.tmpdir=/path/to/new/dir
- --------------
- WARNING: `mlockall` might cause the JVM or shell session to exit if it tries
- to allocate more memory than is available!
- --
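Before choosing one of the three options above, it can help to see how the machine is currently configured. This is a minimal, read-only sketch for Linux; the variable names are assumptions, and the commands that change anything are left as comments because they need `root`.

```shell
# Linux only: both files live under /proc and are absent on other systems.
swappiness=unknown
swap_areas=unknown

if [ -r /proc/sys/vm/swappiness ]; then
    swappiness=$(cat /proc/sys/vm/swappiness)
fi
if [ -r /proc/swaps ]; then
    # The first line of /proc/swaps is a header; each further line is an active swap area.
    swap_areas=$(tail -n +2 /proc/swaps | wc -l | tr -d ' ')
fi
echo "vm.swappiness=$swappiness active_swap_areas=$swap_areas"

# To apply an option from the list above, run as root:
#   swapoff -a                  # disable swap until the next reboot
#   sysctl -w vm.swappiness=1   # keep swap for emergencies only
#   ulimit -l unlimited         # allow memory locking, then start Elasticsearch
```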
- [float]
- [[settings]]
- === Elasticsearch Settings
- Elasticsearch configuration files can be found in the `ES_HOME/config`
- folder. The folder contains two files: `elasticsearch.yml` for
- configuring the different Elasticsearch
- <<modules,modules>>, and `logging.yml` for
- configuring Elasticsearch logging.
- The configuration format is http://www.yaml.org/[YAML]. Here is an
- example of changing the address all network based modules will use to
- bind and publish to:
- [source,yaml]
- --------------------------------------------------
- network :
-     host : 10.0.0.4
- --------------------------------------------------
- [float]
- [[paths]]
- ==== Paths
- In production use, you will almost certainly want to change paths for
- data and log files:
- [source,yaml]
- --------------------------------------------------
- path:
-   logs: /var/log/elasticsearch
-   data: /var/data/elasticsearch
- --------------------------------------------------
- [float]
- [[cluster-name]]
- ==== Cluster name
- Also, don't forget to give your production cluster a name, which is used
- to discover and auto-join other nodes:
- [source,yaml]
- --------------------------------------------------
- cluster:
-   name: <NAME OF YOUR CLUSTER>
- --------------------------------------------------
- Make sure that you don't reuse the same cluster names in different
- environments, otherwise you might end up with nodes joining the wrong cluster.
- For instance you could use `logging-dev`, `logging-stage`, and `logging-prod`
- for the development, staging, and production clusters.
- [float]
- [[node-name]]
- ==== Node name
- You may also want to change the default node name for each node to
- something like the display hostname. By default Elasticsearch will
- randomly pick a Marvel character name from a list of around 3000 names
- when your node starts up.
- [source,yaml]
- --------------------------------------------------
- node:
-   name: <NAME OF YOUR NODE>
- --------------------------------------------------
- The hostname of the machine is provided in the environment
- variable `HOSTNAME`. If you run only a single Elasticsearch node
- for that cluster on your machine, you can set
- the node name to the hostname using the `${...}` notation:
- [source,yaml]
- --------------------------------------------------
- node:
-   name: ${HOSTNAME}
- --------------------------------------------------
- [float]
- [[styles]]
- ==== Configuration styles
- Internally, all settings are collapsed into "namespaced" settings. For
- example, the above gets collapsed into `node.name`. This means that
- it's easy to support other configuration formats, for example,
- http://www.json.org[JSON]. If JSON is a preferred configuration format,
- simply rename the `elasticsearch.yml` file to `elasticsearch.json` and
- add:
- [source,yaml]
- --------------------------------------------------
- {
-     "network" : {
-         "host" : "10.0.0.4"
-     }
- }
- --------------------------------------------------
- It also means that it's easy to provide the settings externally, either
- using `ES_JAVA_OPTS` or as parameters to the `elasticsearch`
- command, for example:
- [source,sh]
- --------------------------------------------------
- $ elasticsearch -Des.network.host=10.0.0.4
- --------------------------------------------------
- Another option is to use the `es.default.` prefix instead of the `es.` prefix,
- in which case the value acts as a default that is used only if the setting is
- not explicitly set in the configuration file.
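As a sketch of the `es.default.` form, the earlier `network.host` example becomes the following. The address `10.0.0.4` is reused from that example, and the variable `es_args` is an assumption used only to show the argument without launching a node.

```shell
# With the es.default. prefix, a network.host value in elasticsearch.yml still wins;
# 10.0.0.4 is used only when the configuration file does not set network.host.
es_args="-Des.default.network.host=10.0.0.4"
echo "elasticsearch $es_args"
```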
- Another option is to use the `${...}` notation within the configuration
- file which will resolve to an environment setting, for example:
- [source,js]
- --------------------------------------------------
- {
-     "network" : {
-         "host" : "${ES_NET_HOST}"
-     }
- }
- --------------------------------------------------
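For the placeholder above to resolve, the variable has to exist in the environment of the process that starts Elasticsearch. A minimal sketch, assuming the address from the earlier examples:

```shell
# Assumption: 10.0.0.4 is the address used in the earlier examples.
# Export the variable so that processes started from this shell inherit it.
export ES_NET_HOST=10.0.0.4
echo "network.host will resolve to: $ES_NET_HOST"
# Then start the node from the same shell, for example: ./bin/elasticsearch
```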
- Additionally, for settings that you do not wish to store in the configuration
- file, you can use the value `${prompt.text}` or `${prompt.secret}` and start
- Elasticsearch in the foreground. `${prompt.secret}` has echoing disabled so
- that the value entered will not be shown in your terminal; `${prompt.text}`
- will allow you to see the value as you type it in. For example:
- [source,yaml]
- --------------------------------------------------
- node:
-   name: ${prompt.text}
- --------------------------------------------------
- On execution of the `elasticsearch` command, you will be prompted to enter
- the actual value like so:
- [source,sh]
- --------------------------------------------------
- Enter value for [node.name]:
- --------------------------------------------------
- NOTE: Elasticsearch will not start if `${prompt.text}` or `${prompt.secret}`
- is used in the settings and the process is run as a service or in the background.
- The location of the configuration file can be set externally using a
- system property:
- [source,sh]
- --------------------------------------------------
- $ elasticsearch -Des.config=/path/to/config/file
- --------------------------------------------------
- [float]
- [[configuration-index-settings]]
- === Index Settings
- Indices created within the cluster can provide their own settings. For
- example, the following creates an index with a refresh interval of 5 seconds
- instead of the default 1 second (the format can be either
- YAML or JSON):
- [source,sh]
- --------------------------------------------------
- $ curl -XPUT http://localhost:9200/kimchy/ -d \
- '
- index:
-     refresh_interval: 5s
- '
- --------------------------------------------------
- Index level settings can be set on the node level as well, for example,
- within the `elasticsearch.yml` file, the following can be set:
- [source,yaml]
- --------------------------------------------------
- index :
-     refresh_interval: 5s
- --------------------------------------------------
- This means that every index that gets created on the specific node
- started with the mentioned configuration will use a refresh interval of
- 5 seconds *unless the index explicitly sets it*. In other words, any index level
- settings override what is set in the node configuration. Of course, the
- above can also be set as a "collapsed" setting, for example:
- [source,sh]
- --------------------------------------------------
- $ elasticsearch -Des.index.refresh_interval=5s
- --------------------------------------------------
- All of the index level configuration can be found within each
- <<index-modules,index module>>.
- [float]
- [[logging]]
- === Logging
- Elasticsearch uses an internal logging abstraction and comes, out of the
- box, with http://logging.apache.org/log4j/1.2/[log4j]. It tries to simplify
- log4j configuration by using http://www.yaml.org/[YAML] to configure it,
- and the logging configuration file is `config/logging.yml`. The
- http://en.wikipedia.org/wiki/JSON[JSON] and
- http://en.wikipedia.org/wiki/.properties[properties] formats are also
- supported. Multiple configuration files can be loaded, in which case they will
- get merged, as long as they start with the `logging.` prefix and end with one
- of the supported suffixes (either `.yml`, `.yaml`, `.json` or `.properties`).
- The logger section contains the Java packages and their corresponding log
- level, where it is possible to omit the `org.elasticsearch` prefix. The
- appender section contains the destinations for the logs. Extensive information
- on how to customize logging and all the supported appenders can be found in
- the http://logging.apache.org/log4j/1.2/manual.html[log4j documentation].
- Additional Appenders and other logging classes provided by
- http://logging.apache.org/log4j/extras/[log4j-extras] are also available,
- out of the box.
- [float]
- [[deprecation-logging]]
- ==== Deprecation logging
- In addition to regular logging, Elasticsearch allows you to enable logging
- of deprecated actions. For example, this allows you to determine early whether
- you will need to migrate certain functionality in the future. By default,
- deprecation logging is disabled. You can enable it in the `config/logging.yml`
- file by setting the deprecation log level to `DEBUG`.
- [source,yaml]
- --------------------------------------------------
- deprecation: DEBUG, deprecation_log_file
- --------------------------------------------------
- This will create a daily rolling deprecation log file in your log directory.
- Check this file regularly, especially when you intend to upgrade to a new
- major version.