[[setup-configuration]]
== Configuration

[float]
=== Environment Variables

The startup scripts pass a built-in set of `JAVA_OPTS` to the JVM that
they launch. The most important of these settings are `-Xmx`, which
controls the maximum memory allowed for the process, and `-Xms`, which
controls the minimum memory allocated to the process (_in general, the
more memory allocated to the process, the better_).

In most cases it is better to leave the default `JAVA_OPTS` as they are
and to use the `ES_JAVA_OPTS` environment variable to set or change
JVM settings and arguments.

The `ES_HEAP_SIZE` environment variable sets the heap memory that is
allocated to the Elasticsearch Java process. It assigns the same value to
both the minimum and the maximum, though the two can be set
explicitly (not recommended) with `ES_MIN_MEM` (defaults to
`256m`) and `ES_MAX_MEM` (defaults to `1g`).

It is recommended to set the min and max memory to the same value, and to
enable <<setup-configuration-memory,`mlockall`>>.

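
As a minimal sketch of the advice above, the following sets the heap through
`ES_HEAP_SIZE` and passes one extra JVM argument through `ES_JAVA_OPTS`. The
`2g` heap and the `-XX:+HeapDumpOnOutOfMemoryError` flag are illustrative
assumptions, not recommendations from this guide:

[source,sh]
--------------------------------------------------
# Illustrative values only: pick a heap size that fits your machine.
export ES_HEAP_SIZE=2g

# Extra JVM arguments go here, leaving the built-in JAVA_OPTS untouched.
export ES_JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError"

echo "heap=${ES_HEAP_SIZE} extra_jvm_opts=${ES_JAVA_OPTS}"

# Start the node afterwards from ES_HOME, for example: ./bin/elasticsearch
--------------------------------------------------
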
[float]
[[system]]
=== System Configuration

[float]
[[file-descriptors]]
==== File Descriptors

Make sure to increase the limit on the number of open file descriptors on the
machine (or for the user running Elasticsearch). Setting it to 32k or
even 64k is recommended.

You can retrieve the `max_file_descriptors` for each node
using the <<cluster-nodes-info>> API, with:

[source,sh]
--------------------------------------------------
curl localhost:9200/_nodes/stats/process?pretty
--------------------------------------------------
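
The limit that applies to the current shell can also be checked locally before
a node is started. The following is a minimal sketch, assuming that "32k" means
32768; `fd_limit_ok` is a hypothetical helper name, not part of Elasticsearch:

[source,sh]
--------------------------------------------------
# Succeeds when the given limit is at least the minimum (default 32768,
# an assumed reading of "32k"). A limit of "unlimited" always passes.
fd_limit_ok() {
    fd_limit=$1
    fd_minimum=${2:-32768}
    if [ "$fd_limit" = "unlimited" ]; then
        return 0
    fi
    [ "$fd_limit" -ge "$fd_minimum" ]
}

# ulimit -n prints the soft limit on open file descriptors for this shell.
if fd_limit_ok "$(ulimit -n)"; then
    echo "open file descriptor limit looks sufficient"
else
    echo "open file descriptor limit is below the recommended 32k"
fi
--------------------------------------------------

Run the check as the user that will start Elasticsearch, since limits are
applied per user.
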

[float]
[[max-number-of-threads]]
==== Number of threads

Make sure that the number of threads that the Elasticsearch user can
create is at least 2048.

[float]
[[vm-max-map-count]]
==== Virtual memory

Elasticsearch uses a <<default_fs,`hybrid mmapfs / niofs`>> directory by
default to store its indices. The default operating system limit on mmap
counts is likely to be too low, which may result in out of memory
exceptions. On Linux, you can increase the limit by running the following
command as `root`:

[source,sh]
-------------------------------------
sysctl -w vm.max_map_count=262144
-------------------------------------

To set this value permanently, update the `vm.max_map_count` setting in
`/etc/sysctl.conf`.

NOTE: If you installed Elasticsearch using a package (.deb, .rpm), this setting will be changed automatically. To verify, run `sysctl vm.max_map_count`.

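
The verification step can be scripted. The sketch below reads the current
value from `/proc/sys/vm/max_map_count` (Linux only) and, if it is too low,
prints the line to add to `/etc/sysctl.conf`. `map_count_ok` is a hypothetical
helper name, and nothing is changed on the system:

[source,sh]
--------------------------------------------------
# Value used in the sysctl command above.
required_map_count=262144

# Succeeds when the given value meets the required minimum.
map_count_ok() {
    [ "$1" -ge "$required_map_count" ]
}

# On Linux the current value is exposed under /proc; fall back to 0 elsewhere.
current_map_count=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)

if map_count_ok "$current_map_count"; then
    echo "vm.max_map_count=$current_map_count is sufficient"
else
    echo "vm.max_map_count=$current_map_count is too low; add this line to /etc/sysctl.conf:"
    echo "vm.max_map_count=$required_map_count"
fi
--------------------------------------------------
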
[float]
[[setup-configuration-memory]]
==== Memory Settings

Most operating systems try to use as much memory as possible for file system
caches and eagerly swap out unused application memory, possibly resulting
in the Elasticsearch process being swapped. Swapping is very bad for
performance and for node stability, so it should be avoided at all costs.

There are three options:

* **Disable swap**
+
--

The simplest option is to completely disable swap. Usually Elasticsearch
is the only service running on a box, and its memory usage is controlled
by the `ES_HEAP_SIZE` environment variable. There should be no need
to have swap enabled.

On Linux systems, you can disable swap temporarily
by running: `sudo swapoff -a`. To disable it permanently, you will need
to edit the `/etc/fstab` file and comment out any lines that contain the
word `swap`.

On Windows, the equivalent can be achieved by disabling the paging file entirely
via `System Properties → Advanced → Performance → Advanced → Virtual memory`.

--

* **Configure `swappiness`**
+
--

The second option is to ensure that the sysctl value `vm.swappiness` is set
to `0`. This reduces the kernel's tendency to swap and should not lead to
swapping under normal circumstances, while still allowing the whole system
to swap in emergency conditions.

NOTE: From kernel version 3.5-rc1 and above, a `swappiness` of `0` will
cause the OOM killer to kill the process instead of allowing swapping.
You will need to set `swappiness` to `1` to still allow swapping in
emergencies.

--

* **`mlockall`**
+
--

The third option is to use
http://opengroup.org/onlinepubs/007908799/xsh/mlockall.html[mlockall] on Linux/Unix systems, or https://msdn.microsoft.com/en-us/library/windows/desktop/aa366895%28v=vs.85%29.aspx[VirtualLock] on Windows, to
try to lock the process address space into RAM, preventing any Elasticsearch
memory from being swapped out. This can be done by adding this line
to the `config/elasticsearch.yml` file:

[source,yaml]
--------------
bootstrap.mlockall: true
--------------

After starting Elasticsearch, you can see whether this setting was applied
successfully by checking the value of `mlockall` in the output from this
request:

[source,sh]
--------------
curl http://localhost:9200/_nodes/process?pretty
--------------

If you see that `mlockall` is `false`, the `mlockall` request has failed.
The most probable reason, on Linux/Unix systems, is that the user running
Elasticsearch doesn't have permission to lock memory. This can be granted
by running `ulimit -l unlimited` as `root` before starting Elasticsearch.

Another possible reason why `mlockall` can fail is that the temporary directory
(usually `/tmp`) is mounted with the `noexec` option. This can be solved by
specifying a new temp directory when starting Elasticsearch:

[source,sh]
--------------
./bin/elasticsearch -Djna.tmpdir=/path/to/new/dir
--------------

WARNING: `mlockall` might cause the JVM or shell session to exit if it tries
to allocate more memory than is available!

--
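
The `swappiness` and `mlockall` options above can be sketched in shell as
follows. `recommended_swappiness` and `mlockall_enabled` are hypothetical
helper names; the version test simply encodes the 3.5 kernel threshold from
the note above, and the `grep` pattern assumes that the field appears in the
response in the form `"mlockall" : true`:

[source,sh]
--------------------------------------------------
# Prints 1 for kernels 3.5 and later, 0 for older kernels.
recommended_swappiness() {
    kernel_major=${1%%.*}
    kernel_rest=${1#*.}
    kernel_minor=${kernel_rest%%[!0-9]*}
    if [ "$kernel_major" -gt 3 ] ||
       { [ "$kernel_major" -eq 3 ] && [ "${kernel_minor:-0}" -ge 5 ]; }; then
        echo 1
    else
        echo 0
    fi
}

# Reads a nodes info response on stdin and succeeds if mlockall is true.
mlockall_enabled() {
    grep -q '"mlockall"[[:space:]]*:[[:space:]]*true'
}

echo "suggested vm.swappiness: $(recommended_swappiness "$(uname -r)")"

# To apply it (as root): sysctl -w vm.swappiness=<value>
# To check a running node:
#   curl -s http://localhost:9200/_nodes/process?pretty | mlockall_enabled
--------------------------------------------------
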

[float]
[[settings]]
=== Elasticsearch Settings

Elasticsearch configuration files can be found under the `ES_HOME/config`
folder. The folder comes with two files: `elasticsearch.yml` for
configuring the different Elasticsearch
<<modules,modules>>, and `logging.yml` for
configuring Elasticsearch logging.

The configuration format is http://www.yaml.org/[YAML]. Here is an
example of changing the address that all network-based modules will use to
bind and publish to:

[source,yaml]
--------------------------------------------------
network :
    host : 10.0.0.4
--------------------------------------------------

[float]
[[paths]]
==== Paths

In production use, you will almost certainly want to change paths for
data and log files:

[source,yaml]
--------------------------------------------------
path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch
--------------------------------------------------

[float]
[[cluster-name]]
==== Cluster name

Also, don't forget to give your production cluster a name, which is used
to discover and auto-join other nodes:

[source,yaml]
--------------------------------------------------
cluster:
  name: <NAME OF YOUR CLUSTER>
--------------------------------------------------

Make sure that you don't reuse the same cluster name in different
environments, otherwise you might end up with nodes joining the wrong cluster.
For instance, you could use `logging-dev`, `logging-stage`, and `logging-prod`
for the development, staging, and production clusters.

[float]
[[node-name]]
==== Node name

You may also want to change the default node name for each node to
something like the display hostname. By default, Elasticsearch will
randomly pick a Marvel character name from a list of around 3000 names
when your node starts up.

[source,yaml]
--------------------------------------------------
node:
  name: <NAME OF YOUR NODE>
--------------------------------------------------

The hostname of the machine is provided in the environment
variable `HOSTNAME`. If you run only a single Elasticsearch node
for that cluster on your machine, you can set
the node name to the hostname using the `${...}` notation:

[source,yaml]
--------------------------------------------------
node:
  name: ${HOSTNAME}
--------------------------------------------------
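
Some shells (bash, for example) set `HOSTNAME` as a shell variable without
exporting it to child processes, in which case the placeholder may not be
resolvable from the environment. A minimal sketch that exports it before the
node is started, falling back to `uname -n` when it is unset (the fallback is
an assumption, not something Elasticsearch requires):

[source,sh]
--------------------------------------------------
# Export HOSTNAME so that child processes, including the JVM, can see it.
# If the shell has not set it, fall back to the name reported by uname -n.
export HOSTNAME="${HOSTNAME:-$(uname -n)}"

echo "node.name will resolve to: ${HOSTNAME}"

# Start the node afterwards from ES_HOME, for example: ./bin/elasticsearch
--------------------------------------------------
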

[float]
[[styles]]
==== Configuration styles

Internally, all settings are collapsed into "namespaced" settings. For
example, the nested `node` and `name` keys above collapse into `node.name`.
This means that it's easy to support other configuration formats, for example,
http://www.json.org[JSON]. If JSON is the preferred configuration format,
simply rename the `elasticsearch.yml` file to `elasticsearch.json` and
add:

[source,js]
--------------------------------------------------
{
    "network" : {
        "host" : "10.0.0.4"
    }
}
--------------------------------------------------

It also means that it's easy to provide the settings externally, either
using `ES_JAVA_OPTS` or as parameters to the `elasticsearch`
command, for example:

[source,sh]
--------------------------------------------------
$ elasticsearch -Ees.network.host=10.0.0.4
--------------------------------------------------

Another option is to use the `es.default.` prefix instead of the `es.` prefix,
in which case the value acts as a default that is used only if the setting
is not explicitly set in the configuration file.

Another option is to use the `${...}` notation within the configuration
file, which will resolve to an environment setting, for example:

[source,js]
--------------------------------------------------
{
    "network" : {
        "host" : "${ES_NET_HOST}"
    }
}
--------------------------------------------------

Additionally, for settings that you do not wish to store in the configuration
file, you can use the value `${prompt.text}` or `${prompt.secret}` and start
Elasticsearch in the foreground. `${prompt.secret}` has echoing disabled so
that the value entered will not be shown in your terminal; `${prompt.text}`
will allow you to see the value as you type it in. For example:

[source,yaml]
--------------------------------------------------
node:
  name: ${prompt.text}
--------------------------------------------------

On execution of the `elasticsearch` command, you will be prompted to enter
the actual value like so:

[source,sh]
--------------------------------------------------
Enter value for [node.name]:
--------------------------------------------------

NOTE: Elasticsearch will not start if `${prompt.text}` or `${prompt.secret}`
is used in the settings and the process is run as a service or in the background.

[float]
[[configuration-index-settings]]
=== Index Settings

Indices created within the cluster can provide their own settings. For
example, the following creates an index with a refresh interval of 5
seconds instead of the default refresh interval (the format can be either
YAML or JSON):

[source,sh]
--------------------------------------------------
$ curl -XPUT http://localhost:9200/kimchy/ -d \
'
index:
    refresh_interval: 5s
'
--------------------------------------------------
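
Since the body can also be JSON, the same request can be sketched as below.
`index_settings_body` and `create_index` are hypothetical helper names, and the
host, port, and index name are the same as in the example above:

[source,sh]
--------------------------------------------------
# Prints a JSON settings body for the given refresh interval.
index_settings_body() {
    printf '{"index":{"refresh_interval":"%s"}}' "$1"
}

# Creates an index with the given name and refresh interval.
create_index() {
    curl -XPUT "http://localhost:9200/$1/" -d "$(index_settings_body "$2")"
}

# Show the body that would be sent for a 5 second refresh interval.
index_settings_body 5s
echo

# Usage against a running node: create_index kimchy 5s
--------------------------------------------------
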

Index level settings can be set on the node level as well. For example,
the following can be set within the `elasticsearch.yml` file:

[source,yaml]
--------------------------------------------------
index :
    refresh_interval: 5s
--------------------------------------------------

This means that every index created on a node started with this
configuration will use a refresh interval of
5 seconds *unless the index explicitly sets it*. In other words, any
index level settings override what is set in the node configuration. Of
course, the above can also be set as a "collapsed" setting, for example:

[source,sh]
--------------------------------------------------
$ elasticsearch -Ees.index.refresh_interval=5s
--------------------------------------------------

All of the index level configuration can be found within each
<<index-modules,index module>>.

[float]
[[logging]]
=== Logging

Elasticsearch uses an internal logging abstraction and comes, out of the
box, with http://logging.apache.org/log4j/1.2/[log4j]. It tries to simplify
log4j configuration by using http://www.yaml.org/[YAML] to configure it,
and the logging configuration file is `config/logging.yml`. The
http://en.wikipedia.org/wiki/JSON[JSON] and
http://en.wikipedia.org/wiki/.properties[properties] formats are also
supported. Multiple configuration files can be loaded, in which case they will
get merged, as long as they start with the `logging.` prefix and end with one
of the supported suffixes (`.yml`, `.yaml`, `.json` or `.properties`).
The logger section contains the Java packages and their corresponding log
levels, where it is possible to omit the `org.elasticsearch` prefix. The
appender section contains the destinations for the logs. Extensive information
on how to customize logging and all the supported appenders can be found in
the http://logging.apache.org/log4j/1.2/manual.html[log4j documentation].

Additional appenders and other logging classes provided by
http://logging.apache.org/log4j/extras/[log4j-extras] are also available,
out of the box.

[float]
[[deprecation-logging]]
==== Deprecation logging

In addition to regular logging, Elasticsearch allows you to enable logging
of deprecated actions. For example, this allows you to determine early whether
you need to migrate certain functionality in the future. By default,
deprecation logging is disabled. You can enable it in the `config/logging.yml`
file by setting the deprecation log level to `DEBUG`.

[source,yaml]
--------------------------------------------------
deprecation: DEBUG, deprecation_log_file
--------------------------------------------------

This will create a daily rolling deprecation log file in your log directory.
Check this file regularly, especially when you intend to upgrade to a new
major version.