  1. [[bootstrap-checks]]
  2. == Bootstrap Checks
  3. Collectively, we have a lot of experience with users suffering
  4. unexpected issues because they have not configured
  5. <<important-settings,important settings>>. In previous versions of
  6. Elasticsearch, misconfiguration of some of these settings was logged
  7. as a warning. Understandably, users sometimes miss these log messages.
  8. To ensure that these settings receive the attention that they deserve,
  9. Elasticsearch has bootstrap checks upon startup.
  10. These bootstrap checks inspect a variety of Elasticsearch and system
  11. settings and compare them to values that are safe for the operation of
  12. Elasticsearch. If Elasticsearch is in development mode, any bootstrap
  13. checks that fail appear as warnings in the Elasticsearch log. If
  14. Elasticsearch is in production mode, any bootstrap checks that fail will
  15. cause Elasticsearch to refuse to start.
  16. There are some bootstrap checks that are always enforced to prevent
  17. Elasticsearch from running with incompatible settings. These checks are
  18. documented individually.
  19. [discrete]
  20. [[dev-vs-prod-mode]]
  21. === Development vs. production mode
  22. By default, {es} binds to loopback addresses for
  23. <<modules-network,HTTP and transport (internal) communication>>. This is fine
  24. for downloading and playing with {es} as well as everyday development,
  25. but it's useless for production systems. To join a cluster, an {es}
  26. node must be reachable via transport communication. To join a cluster via a
  27. non-loopback address, a node must bind transport to a non-loopback address and
  28. not be using <<single-node-discovery,single-node discovery>>. Thus, we consider
  29. an Elasticsearch node to be in development mode if it cannot form a cluster
  30. with another machine via a non-loopback address, and in production mode if it
  31. can join a cluster via a non-loopback address.
  32. Note that HTTP and transport can be configured independently via
  33. <<http-settings,`http.host`>> and <<transport-settings,`transport.host`>>; this
  34. can be useful for configuring a single node to be reachable via HTTP for testing
  35. purposes without triggering production mode.
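The rule above can be sketched as a small decision function. This is only an illustration of the stated rule, not how {es} implements it: the names `is_loopback` and `node_mode` are hypothetical, and treating unresolvable host names as non-loopback is an assumption made to keep the sketch free of DNS lookups.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """Best-effort test for whether a configured host refers to loopback."""
    if host in ("localhost", "_local_"):
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Assumption: host names we cannot parse as IP literals are
        # treated as non-loopback; no name resolution is attempted.
        return False


def node_mode(transport_hosts, discovery_type=None) -> str:
    """Classify a node from its transport bind hosts and discovery type."""
    binds_externally = any(not is_loopback(h) for h in transport_hosts)
    if binds_externally and discovery_type != "single-node":
        return "production"
    return "development"
```

For example, `node_mode(["192.168.1.10"])` yields `"production"`, while the same address with `discovery_type="single-node"` yields `"development"`. Only the transport hosts are considered, which mirrors the note that `http.host` can be bound externally without triggering production mode.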
  36. [[single-node-discovery]]
  37. [discrete]
  38. === Single-node discovery
  39. We recognize that some users need to bind transport to an external interface for
  40. testing their usage of the transport client. For this situation, we provide the
  41. discovery type `single-node` (configure it by setting `discovery.type` to
  42. `single-node`); in this situation, a node will elect itself master and will not
  43. join a cluster with any other node.
  44. [discrete]
  45. === Forcing the bootstrap checks
  46. If you are running a single node in production, it is possible to evade the
  47. bootstrap checks (either by not binding transport to an external interface, or
  48. by binding transport to an external interface and setting the discovery type to
  49. `single-node`). For this situation, you can force execution of the bootstrap
  50. checks by setting the system property `es.enforce.bootstrap.checks` to `true`
  51. (set this in <<jvm-options>>, or by adding `-Des.enforce.bootstrap.checks=true`
  52. to the environment variable `ES_JAVA_OPTS`). We strongly encourage you to do
  53. this if you are in this specific situation. This system property can be used to
  54. force execution of the bootstrap checks independent of the node configuration.
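If {es} is launched from a wrapper script, the `ES_JAVA_OPTS` approach might look like the following sketch. The helper name `with_enforced_bootstrap_checks` is hypothetical, and the function only prepares an environment mapping; it does not start {es}.

```python
ENFORCE_FLAG = "-Des.enforce.bootstrap.checks=true"


def with_enforced_bootstrap_checks(env: dict) -> dict:
    """Return a copy of env whose ES_JAVA_OPTS contains the flag exactly once."""
    # Assumption: options are separated by plain whitespace; quoted options
    # that contain spaces would need a real shell-style parser.
    opts = env.get("ES_JAVA_OPTS", "").split()
    if ENFORCE_FLAG not in opts:
        opts.append(ENFORCE_FLAG)
    return {**env, "ES_JAVA_OPTS": " ".join(opts)}
```

The returned mapping could then be passed as the `env` argument of `subprocess.run` when starting the node.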
  55. === Heap size check
  56. By default, {es} automatically sizes JVM heap based on a node's
  57. <<node-roles,roles>> and total memory. If you manually override the default
  58. sizing and start the JVM with different initial and max heap sizes, the JVM may
  59. pause as it resizes the heap during system usage. If you enable
  60. <<bootstrap-memory_lock,`bootstrap.memory_lock`>>, the JVM locks the initial heap
  61. size on startup. If the initial heap size is not equal to the maximum heap size,
  62. some JVM heap may not be locked after a resize. To avoid these issues, start the
  63. JVM with an initial heap size equal to the maximum heap size.
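As a rough illustration, a list of JVM arguments can be checked for matching `-Xms` and `-Xmx` values before starting a node. The helper names below are hypothetical, and the sketch only inspects argument strings rather than the values the running JVM actually reports.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text: str) -> int:
    """Convert a JVM size such as '4g' or '4096m' to bytes."""
    match = re.fullmatch(r"(\d+)([kmgt]?)", text.lower())
    if match is None:
        raise ValueError(f"unrecognized heap size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def heap_sizes_match(jvm_args) -> bool:
    """True only if both -Xms and -Xmx are present and equal in bytes."""
    initial = maximum = None
    for arg in jvm_args:
        if arg.startswith("-Xms"):
            initial = parse_size(arg[4:])
        elif arg.startswith("-Xmx"):
            maximum = parse_size(arg[4:])
    return initial is not None and initial == maximum
```

Comparing in bytes means that `-Xms4096m` and `-Xmx4g` are treated as equal.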
  64. === File descriptor check
  65. File descriptors are a Unix construct for tracking open "files". In Unix
  66. though, {wikipedia}/Everything_is_a_file[everything is
  67. a file]. For example, a "file" could be a physical file, a virtual file
  68. (e.g., `/proc/loadavg`), or a network socket. Elasticsearch requires
  69. lots of file descriptors (e.g., every shard is composed of multiple
  70. segments and other files, plus connections to other nodes, etc.). This
  71. bootstrap check is enforced on OS X and Linux. To pass the file
  72. descriptor check, you might have to configure <<file-descriptors,file
  73. descriptors>>.
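Before starting a node, the limit that applies to the current process can be inspected with Python's standard `resource` module (Unix only). This is a sketch with hypothetical helper names; the required value is left as a parameter because the text above does not state a threshold.

```python
import resource


def soft_limit_at_least(limit_id: int, required: int) -> bool:
    """True if the soft limit for limit_id is unlimited or >= required."""
    soft, _hard = resource.getrlimit(limit_id)
    return soft == resource.RLIM_INFINITY or soft >= required


def file_descriptor_limit_ok(required: int) -> bool:
    """Check the open-file limit of the *current* process."""
    return soft_limit_at_least(resource.RLIMIT_NOFILE, required)
```

The result reflects the user and shell that run the script, so it is only meaningful when run as the same user, and under the same limits, as the {es} process.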
  74. === Memory lock check
  75. When the JVM does a major garbage collection it touches every page of
  76. the heap. If any of those pages are swapped out to disk they will have
  77. to be swapped back into memory. That causes lots of disk thrashing, consuming
  78. I/O capacity that Elasticsearch would much rather use to service requests. There are
  79. several ways to configure a system to disallow swapping. One way is by
  80. requesting the JVM to lock the heap in memory through `mlockall` (Unix)
  81. or virtual lock (Windows). This is done via the Elasticsearch setting
  82. <<bootstrap-memory_lock,`bootstrap.memory_lock`>>. However, there are
  83. cases where this setting can be passed to Elasticsearch but
  84. Elasticsearch is not able to lock the heap (e.g., if the `elasticsearch`
  85. user does not have `memlock unlimited`). The memory lock check verifies
  86. that *if* the `bootstrap.memory_lock` setting is enabled, the JVM
  87. was able to lock the heap. To pass the memory lock check,
  88. you might have to configure <<bootstrap-memory_lock,`bootstrap.memory_lock`>>.
  89. [[max-number-threads-check]]
  90. === Maximum number of threads check
  91. Elasticsearch executes requests by breaking the request down into stages
  92. and handing those stages off to different thread pool executors. There
  93. are different <<modules-threadpool,thread pool executors>> for a variety
  94. of tasks within Elasticsearch. Thus, Elasticsearch needs the ability to
  95. create a lot of threads. The maximum number of threads check ensures
  96. that the Elasticsearch process has the rights to create enough threads
  97. under normal use. This check is enforced only on Linux. If you are on
  98. Linux, to pass the maximum number of threads check, you must configure
  99. your system to allow the Elasticsearch process the ability to create at
  100. least 4096 threads. This can be done via `/etc/security/limits.conf`
  101. using the `nproc` setting (note that you might have to increase the
  102. limits for the `root` user too).
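The following sketch shows one way to compare the current process's `nproc` limit with the 4096 threshold mentioned above. It assumes Linux, where Python's `resource` module exposes `RLIMIT_NPROC`; the function name is hypothetical.

```python
import resource

MIN_THREADS = 4096  # threshold stated in the text above


def thread_limit_ok(minimum: int = MIN_THREADS) -> bool:
    """True if the soft nproc limit is unlimited or at least `minimum`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return soft == resource.RLIM_INFINITY or soft >= minimum
```

As with any limit check, run it as the user that will own the {es} process, because limits set in `/etc/security/limits.conf` are applied per user.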
  103. === Max file size check
  104. The segment files that are the components of individual shards and the translog
  105. generations that are components of the translog can get large (exceeding
  106. multiple gigabytes). On systems where the max size of files that can be created
  107. by the Elasticsearch process is limited, this can lead to failed
  108. writes. Therefore, the safest option here is that the max file size is unlimited
  109. and that is what the max file size bootstrap check enforces. To pass the max
  110. file check, you must configure your system to allow the Elasticsearch process
  111. the ability to write files of unlimited size. This can be done via
  112. `/etc/security/limits.conf` using the `fsize` setting to `unlimited` (note that
  113. you might have to increase the limits for the `root` user too).
  114. [[max-size-virtual-memory-check]]
  115. === Maximum size virtual memory check
  116. Elasticsearch and Lucene use `mmap` to great effect to map portions of
  117. an index into the Elasticsearch address space. This keeps certain index
  118. data off the JVM heap but in memory for blazing fast access. For this to
  119. be effective, the Elasticsearch process should have unlimited address space. The
  120. maximum size virtual memory check enforces that the Elasticsearch
  121. process has unlimited address space and is enforced only on Linux. To
  122. pass the maximum size virtual memory check, you must configure your
  123. system to allow the Elasticsearch process the ability to have unlimited
  124. address space. This can be done via adding `<user> - as unlimited`
  125. to `/etc/security/limits.conf`. This may require you to increase the limits
  126. for the `root` user too.
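Both this check and the max file size check require a limit to be `unlimited`. A minimal sketch of reading those two limits for the current process follows; the helper names are hypothetical, and the mapping of `fsize` and `as` to `RLIMIT_FSIZE` and `RLIMIT_AS` is the standard one for `limits.conf`.

```python
import resource


def is_unlimited(limit_id: int) -> bool:
    """True if the soft limit for limit_id is RLIM_INFINITY."""
    soft, _hard = resource.getrlimit(limit_id)
    return soft == resource.RLIM_INFINITY


def unlimited_report() -> dict:
    """Report whether max file size and address space are unlimited."""
    return {
        "fsize": is_unlimited(resource.RLIMIT_FSIZE),
        "as": is_unlimited(resource.RLIMIT_AS),
    }
```

A `False` entry in the report points at the `limits.conf` item that still needs to be raised for that user.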
  127. === Maximum map count check
  128. Continuing from the previous <<max-size-virtual-memory-check,point>>, to
  129. use `mmap` effectively, Elasticsearch also requires the ability to
  130. create many memory-mapped areas. The maximum map count check verifies that
  131. the kernel allows a process to have at least 262,144 memory-mapped areas
  132. and is enforced on Linux only. To pass the maximum map count check, you
  133. must configure `vm.max_map_count` via `sysctl` to be at least `262144`.
  134. Alternatively, the maximum map count check is only needed if you are using
  135. `mmapfs` or `hybridfs` as the <<index-modules-store,store type>> for your
  136. indices. If you <<allow-mmap,do not allow>> the use of `mmap` then this
  137. bootstrap check will not be enforced.
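On Linux the current value of `vm.max_map_count` is exposed under `/proc/sys`, so it can be compared with the required minimum as in this sketch. The function name is hypothetical, and the path is a parameter so that the logic can be exercised against a test file.

```python
from pathlib import Path

REQUIRED_MAP_COUNT = 262144  # minimum stated in the text above


def max_map_count_ok(path: str = "/proc/sys/vm/max_map_count",
                     required: int = REQUIRED_MAP_COUNT) -> bool:
    """True if the kernel's vm.max_map_count is at least `required`."""
    current = int(Path(path).read_text().strip())
    return current >= required
```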
  138. === Client JVM check
  139. OpenJDK-derived JVMs provide two different VMs: the
  140. client JVM and the server JVM. These JVMs use different compilers for
  141. producing executable machine code from Java bytecode. The client JVM is
  142. tuned for startup time and memory footprint while the server JVM is
  143. tuned for maximizing performance. The difference in performance between
  144. the two VMs can be substantial. The client JVM check ensures that
  145. Elasticsearch is not running inside the client JVM. To pass the client
  146. JVM check, you must start Elasticsearch with the server VM. On modern
  147. systems and operating systems, the server VM is the
  148. default.
  149. === Use serial collector check
  150. There are various garbage collectors for the OpenJDK-derived JVMs
  151. targeting different workloads. The serial collector in particular is
  152. best suited for single logical CPU machines or extremely small heaps,
  153. neither of which is suitable for running Elasticsearch. Using the
  154. serial collector with Elasticsearch can be devastating for performance.
  155. The serial collector check ensures that Elasticsearch is not configured
  156. to run with the serial collector. To pass the serial collector check,
  157. you must not start Elasticsearch with the serial collector (whether it's
  158. from the defaults for the JVM that you're using, or you've explicitly
  159. specified it with `-XX:+UseSerialGC`). Note that the default JVM
  160. configuration that ships with Elasticsearch configures Elasticsearch to
  161. use the G1GC collector with JDK 14 and later versions. For earlier
  162. JDK versions, the configuration defaults to the CMS collector.
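An explicitly requested serial collector can be spotted by scanning the JVM arguments, as in the sketch below. The function name is hypothetical, and the sketch cannot detect the case where the JVM selects the serial collector by default; it only looks at explicit flags.

```python
def serial_collector_requested(jvm_args) -> bool:
    """True if the last UseSerialGC flag in jvm_args enables the collector."""
    enabled = False
    for arg in jvm_args:
        if arg == "-XX:+UseSerialGC":
            enabled = True
        elif arg == "-XX:-UseSerialGC":
            # A later explicit disable overrides an earlier enable.
            enabled = False
    return enabled
```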
  163. === System call filter check
  164. Elasticsearch installs system call filters of various flavors depending
  165. on the operating system (e.g., seccomp on Linux). These system call
  166. filters are installed to prevent the ability to execute system calls
  167. related to forking as a defense mechanism against arbitrary code
  168. execution attacks on Elasticsearch. The system call filter check ensures
  169. that if system call filters are enabled, then they were successfully
  170. installed. To pass the system call filter check you must either fix any
  171. configuration errors on your system that prevented system call filters
  172. from installing (check your logs), or *at your own risk* disable system
  173. call filters by setting `bootstrap.system_call_filter` to `false`.
  174. === OnError and OnOutOfMemoryError checks
  175. The JVM options `OnError` and `OnOutOfMemoryError` enable executing
  176. arbitrary commands if the JVM encounters a fatal error (`OnError`) or an
  177. `OutOfMemoryError` (`OnOutOfMemoryError`). However, by default,
  178. Elasticsearch system call filters (seccomp) are enabled and these
  179. filters prevent forking. Thus, using `OnError` or `OnOutOfMemoryError`
  180. is incompatible with system call filters. The `OnError` and
  181. `OnOutOfMemoryError` checks prevent Elasticsearch from starting if
  182. either of these JVM options is used and system call filters are
  183. enabled. This check is always enforced. To pass this check, do not enable
  184. `OnError` or `OnOutOfMemoryError`; instead, upgrade to Java 8u92 or later and
  185. use the JVM flag `ExitOnOutOfMemoryError`. While this does not have the
  186. full capabilities of `OnError` or `OnOutOfMemoryError`, arbitrary
  187. forking will not be supported with seccomp enabled.
  188. === Early-access check
  189. The OpenJDK project provides early-access snapshots of upcoming releases. These
  190. releases are not suitable for production. The early-access check detects these
  191. early-access snapshots. To pass this check, you must start Elasticsearch on a
  192. release build of the JVM.
  193. === G1GC check
  194. Early versions of the HotSpot JVM that shipped with JDK 8 are known to
  195. have issues that can lead to index corruption when the G1GC collector is
  196. enabled. The versions impacted are those earlier than the version of
  197. HotSpot that shipped with JDK 8u40. The G1GC check detects these early
  198. versions of the HotSpot JVM.
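To illustrate the version boundary, the sketch below classifies a `java.version` string in the legacy `1.8.0_NN` format as earlier than 8u40 or not. This is an assumption-laden simplification: the function names are hypothetical, and the sketch inspects the JDK update number rather than the HotSpot version itself.

```python
import re


def is_early_jdk8(java_version: str) -> bool:
    """True for JDK 8 version strings with an update number below 40."""
    match = re.fullmatch(r"1\.8\.0(?:_(\d+))?(?:-.*)?", java_version)
    if match is None:
        return False  # not a JDK 8 string in the legacy format
    update = int(match.group(1) or 0)
    return update < 40


def g1gc_check_passes(java_version: str, g1gc_enabled: bool) -> bool:
    """Fail only when G1GC is enabled on an affected JDK 8 build."""
    return not (g1gc_enabled and is_early_jdk8(java_version))
```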
  199. === All permission check
  200. The all permission check ensures that the security policy used during bootstrap
  201. does not grant the `java.security.AllPermission` to Elasticsearch. Running with
  202. the all permission granted is equivalent to disabling the security manager.
  203. === Discovery configuration check
  204. By default, when Elasticsearch first starts up it will try to discover other
  205. nodes running on the same host. If no elected master can be discovered within a
  206. few seconds then Elasticsearch will form a cluster that includes any other
  207. nodes that were discovered. It is useful to be able to form this cluster
  208. without any extra configuration in development mode, but this is unsuitable for
  209. production because it's possible to form multiple clusters and lose data as a
  210. result.
  211. This bootstrap check ensures that discovery is not running with the default
  212. configuration. It can be satisfied by setting at least one of the following
  213. properties:
  214. - `discovery.seed_hosts`
  215. - `discovery.seed_providers`
  216. - `cluster.initial_master_nodes`
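The condition above amounts to "at least one of three settings is present". A minimal sketch, assuming the node settings have already been loaded into a flat dictionary keyed by setting name (the function name is hypothetical):

```python
DISCOVERY_SETTINGS = (
    "discovery.seed_hosts",
    "discovery.seed_providers",
    "cluster.initial_master_nodes",
)


def discovery_configured(settings: dict) -> bool:
    """True if any of the three discovery settings is present."""
    # Assumption: presence of the key is what matters; the value is
    # not validated in this sketch.
    return any(name in settings for name in DISCOVERY_SETTINGS)
```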