9 years ago · 82713bab6d
--- a/docs/reference/setup.asciidoc
+++ b/docs/reference/setup.asciidoc
@@ -42,6 +42,8 @@ include::setup/configuration.asciidoc[]
 
				 
			
 
				 include::setup/important-settings.asciidoc[]
			
 
				 
			
 
				+include::setup/bootstrap-checks.asciidoc[]
			
 
				+
			
 
				 include::setup/sysconfig.asciidoc[]
			
 
				 
			
 
				 include::setup/upgrade.asciidoc[]
			
--- a/docs/reference/setup/bootstrap-checks.asciidoc
+++ b/docs/reference/setup/bootstrap-checks.asciidoc
@@ -0,0 +1,147 @@
 
				+[[bootstrap-checks]] == Bootstrap Checks
			
 
				+
			
 
				+Collectively, we have a lot of experience with users suffering
			
 
				+unexpected issues because they have not configured
			
 
				+<<important-setting,important settings>>. In previous versions of
			
 
				+Elasticsearch, misconfiguration of some of these settings were logged
			
 
				+as warnings. Understandably, users sometimes miss these log messages.
			
 
				+To ensure that these settings receive the attention that they deserve,
			
 
				+Elasticsearch has bootstrap checks upon startup.
			
 
				+
			
 
				+These bootstrap checks inspect a variety of Elasticsearch and system
			
 
				+settings and compare them to values that are safe for the operation of
			
 
				+Elasticsearch. If Elasticsearch is in development mode, any bootstrap
			
 
				+checks that fail appear as warnings in the Elasticsearch log. If
			
 
				+Elasticsearch is in production mode, any bootstrap checks that fail will
			
 
				+cause Elasticsearch to refuse to start.
			
 
				+
			
 
				+=== Development vs. production mode
			
 
				+
			
 
				+By default, Elasticsearch binds and publishes to `localhost`. This is
			
 
				+fine for downloading and playing with Elasticsearch, and everyday
			
 
				+development but it's useless for production systems. For a production
			
 
				+installation to be reachable, it must either bind or publish to an
			
 
				+external interface. Thus, we consider Elasticsearch to be in development
			
 
				+mode if it does not bind nor publish to an external interface (the
			
 
				+default), and is otherwise in production mode if it does bind or publish
			
 
				+to an external interface.
			
 
				+
			
 
				+=== Heap size check
			
 
				+
			
 
				+If a JVM is started with unequal initial and max heap size, it can be
			
 
				+prone to pauses as the JVM heap is resized during system usage. To avoid
			
 
				+these resize pauses, it's best to start the JVM with the initial heap
			
 
				+size equal to the maximum heap size. Additionally, if
			
 
				+<<bootstrap.mlockall>> is enabled, the JVM will lock the initial size of
			
 
				+the heap on startup. If the initial heap size is not equal to the
			
 
				+maximum heap size, after a resize it will not be the case that all of
			
 
				+the JVM heap is locked in memory. To pass the heap size check, you must
			
 
				+configure the [[heap-size,heap size]].
			
 
				+
			
 
				+=== File descriptor check
			
 
				+
			
 
				+File descriptors are a Unix construct for tracking open "files". In Unix
			
 
				+though, [everything is a
			
 
				+file](https://en.wikipedia.org/wiki/Everything_is_a_file). For example,
			
 
				+"files" could be a physical file, a virtual file (e.g.,
			
 
				+``/proc/loadavg`), or network sockets. Elasticsearch requires lots file
			
 
				+descriptors (e.g., every shard is composed of multiple segments and
			
 
				+other files, plus connections to other nodes, etc.). This bootstrap
			
 
				+check is enforced on OS X and Linux. To pass the file descriptor check,
			
 
				+you might have to configure [[file-descriptors,file descriptors]].
			
 
				+
			
 
				+=== Memory lock check
			
 
				+
			
 
				+When the JVM does a major garbage collection it touches every page of
			
 
				+the heap. If any of those pages are swapped out to disk they will have
			
 
				+to be swapped back in to memory. That causes lots of disk thrashing that
			
 
				+Elasticsearch would much rather use to service requests. There are
			
 
				+several ways to configure a system to disallow swapping. One way is by
			
 
				+requesting the JVM to lock the heap in memory through `mlockall` (Unix)
			
 
				+or virtual lock (Windows). This is done via the Elasticsearch setting
			
 
				+<<bootstrap.mlockall,`bootstrap.mlockall`>>. However, there are cases
			
 
				+where this setting can be passed to Elasticsearch but Elasticsearch is
			
 
				+not able to lock the heap (e.g., if the `elasticsearch` user does not
			
 
				+have `memlock unlimited`). The memory lock check verifies that *if* the
			
 
				+`bootstrap.mlockall` setting is enabled, that the JVM was successfully
			
 
				+able to lock the heap. To pass the memory lock check, you might have to
			
 
				+configure <<mlockall,`mlockall`>>.
			
 
				+
			
 
				+=== Minimum master nodes check
			
 
				+
			
 
				+Elasticsearch uses a single master for managing cluster state but
			
 
				+enables there to be multiple master-eligible nodes for
			
 
				+high-availability. In the case of a partition, master-eligible nodes on
			
 
				+each side of the partition might be elected as the acting master without
			
 
				+knowing that there is a master on the side of the partition. This can
			
 
				+lead to divergent cluster states potentially leading to data loss when
			
 
				+the partition is healed. This is the notion of a split brain and it is
			
 
				+the worst thing that can happen to an Elasticsearch cluster. But by
			
 
				+configuring <<minimum_master_nodes>> to be equal to a quorum of
			
 
				+master-eligible nodes, it is not possible for the cluster to suffer from
			
 
				+split brain because during a network partition there can be at most one
			
 
				+side of the partition that contains a quorum of master nodes. The
			
 
				+minimum master nodes check enforces that you've set
			
 
				+<<minimum_master_nodes>>. To pass the minimum master nodes check, you
			
 
				+must configure <<minimum_master_nodes>>.
			
 
				+
			
 
				+NOTE: The minimum master nodes check does not enforce that you've
			
 
				+configured <<minimum_master_nodes>> correctly, only that you have it
			
 
				+configured. Elasticsearch does log a warning message if it detects that
			
 
				+<<minimum_master_nodes>> is incorrectly configured based on the number
			
 
				+of master-eligible nodes visible in the cluster state. Future versions
			
 
				+of Elasticsearch will contain stricter enforcement of
			
 
				+<<minimum_master_nodes>>.
			
 
				+
			
 
				+=== Maximum number of threads check
			
 
				+
			
 
				+Elasticsearch executes requests by breaking the request down into stages
			
 
				+and handing those stages off to different thread pool executors. There
			
 
				+are different <<modules-threadpool,thread pool executors>> for a variety
			
 
				+of tasks within Elasticsearch. Thus, Elasticsearch needs the ability to
			
 
				+create a lot of threads. The maximum number of threads check ensures
			
 
				+that the Elasticsearch process has the rights to create enough threads
			
 
				+under normal use. This check is enforced only on Linux. If you are on
			
 
				+Linux, to pass the maximum number of threads check, you must configure
			
 
				+your system to allow the Elasticsearch process the ability to create at
			
 
				+least 2048 threads. This can be done via `/etc/security/limits.conf`
			
 
				+using the `nproc` setting (note that you might have to increase the
			
 
				+limits for the `root` user too).
			
 
				+
			
 
				+[[max-size-virtual-memory-check]]
			
 
				+=== Maximum size virtual memory check
			
 
				+
			
 
				+Elasticsearch and Lucene use `mmap` to great effect to map portions of
			
 
				+an index into the Elasticsearch address space. This keeps certain index
			
 
				+data off the JVM heap but in memory for blazing fast access. For this to
			
 
				+be effective, the Elasticsearch should have unlimited address space. The
			
 
				+maximum size virtual memory check enforces that the Elasticsearch
			
 
				+process has unlimited address space and is enforced only on Linux. To
			
 
				+pass the maximum size virtual memory check, you must configure your
			
 
				+system to allow the Elasticsearch process the ability to have unlimited
			
 
				+address space. This can be done via `/etc/security/limits.conf` using
			
 
				+the `as` setting to `unlimited` (note that you might have to increaes
			
 
				+the limits for the `root` user too).
			
 
				+
			
 
				+=== Maximum map count check
			
 
				+
			
 
				+Continuing from the previous <<max-size-virtual-memory-check,point>>, to
			
 
				+use `mmap` effectively, Elasticsearch also requires the ability to
			
 
				+create many memory-mapped areas. The maximum map count check checks that
			
 
				+the kernel allows a process to have at least 262,144 memory-mapped areas
			
 
				+and is enforced on Linux only. To pass the maximum map count check, you
			
 
				+must configure `vm.max_map_count` via `sysctl` to be at least `262144`.
			
 
				+
			
 
				+=== Client JVM check
			
 
				+
			
 
				+There are two different JVMs provided by OpenJDK-derived JVMs: the
			
 
				+client JVM and the server JVM. These JVMs use different compilers for
			
 
				+producing executable machine code from Java bytecode. The client JVM is
			
 
				+tuned for startup time and memory footprint while the server JVM is
			
 
				+tuned for maximizing performance. The difference in performance between
			
 
				+the two VMs can be substantial. The client JVM check ensures that
			
 
				+Elasticsearch is not running inside the client JVM. To pass the client
			
 
				+JVM check, you must start Elasticsearch with the server VM. On modern
			
 
				+systems and operating systems, the server VM is the
			
 
				+default. Additionally, Elasticsearch is configured by default to force
			
 
				+the server VM.