5 years ago · 961a85e1e6
--- a/docs/reference/docs/refresh.asciidoc
+++ b/docs/reference/docs/refresh.asciidoc
@@ -31,15 +31,15 @@ visible at some point after the request returns.
 
				 
			
 
				 [float]
			
 
				 ==== Choosing which setting to use
			
 
				-
			
 
				-Unless you have a good reason to wait for the change to become visible always
			
 
				-use `refresh=false`, or, because that is the default, just leave the `refresh`
			
 
				-parameter out of the URL. That is the simplest and fastest choice.
			
 
				+// tag::refresh-default[]
			
 
				+Unless you have a good reason to wait for the change to become visible, always
			
 
				+use `refresh=false` (the default setting). The simplest and fastest choice is to omit the `refresh` parameter from the URL.
			
 
				 
			
 
				 If you absolutely must have the changes made by a request visible synchronously
			
 
				-with the request then you must pick between putting more load on
			
 
				-Elasticsearch (`true`) and waiting longer for the response (`wait_for`). Here
			
 
				-are a few points that should inform that decision:
			
 
				+with the request, you must choose between putting more load on
			
 
				+Elasticsearch (`true`) and waiting longer for the response (`wait_for`).
			
 
				+// end::refresh-default[]
			
 
				+Here are a few points that should inform that decision:
			
 
				 
			
 
				 * The more changes being made to the index the more work `wait_for` saves
			
 
				 compared to `true`. In the case that the index is only changed once every
			
--- a/docs/reference/images/lucene-in-memory-buffer.png
+++ b/docs/reference/images/lucene-in-memory-buffer.png
--- a/docs/reference/images/lucene-written-not-committed.png
+++ b/docs/reference/images/lucene-written-not-committed.png
--- a/docs/reference/intro.asciidoc
+++ b/docs/reference/intro.asciidoc
@@ -9,9 +9,9 @@ the {stack}. {ls} and {beats} facilitate collecting, aggregating, and
 
				 enriching your data and storing it in {es}. {kib} enables you to
			
 
				 interactively explore, visualize, and share insights into your data and manage
			
 
				 and monitor the stack. {es} is where the indexing, search, and analysis
			
 
				-magic happen.
			
 
				+magic happens.
			
 
				 
			
 
				-{es} provides real-time search and analytics for all types of data. Whether you
			
 
				+{es} provides near real-time search and analytics for all types of data. Whether you
			
 
				 have structured or unstructured text, numerical data, or geospatial data,
			
 
				 {es} can efficiently store and index it in a way that supports fast searches.
			
 
				 You can go far beyond simple data retrieval and aggregate information to discover
			
@@ -46,8 +46,7 @@ as JSON documents. When you have multiple {es} nodes in a cluster, stored
 
				 documents are distributed across the cluster and can be accessed immediately
			
 
				 from any node.
			
 
				 
			
 
				-When a document is stored, it is indexed and fully searchable in near
			
 
				-real-time--within 1 second. {es} uses a data structure called an
			
 
				+When a document is stored, it is indexed and fully searchable in <<near-real-time,near real-time>>--within 1 second. {es} uses a data structure called an
			
 
				 inverted index that supports very fast full-text searches. An inverted index
			
 
				 lists every unique word that appears in any document and identifies all of the
			
 
				 documents each word occurs in.
			
--- a/docs/reference/search/index.asciidoc
+++ b/docs/reference/search/index.asciidoc
@@ -14,7 +14,7 @@ Depending on your data, you can use a query to get answers to questions like:
 
				 * What pages on my website contain a specific word or phrase?
			
 
				 * What processes on my server take longer than 500 milliseconds to respond?
			
 
				 * What users on my network ran `regsvr32.exe` within the last week?
			
 
				-* How many of my products have a price greater than $20? 
			
 
				+* How many of my products have a price greater than $20?
			
 
				 
			
 
				 A _search_ consists of one or more queries that are combined and sent to {es}.
			
 
				 Documents that match a search's queries are returned in the _hits_, or
			
@@ -29,11 +29,13 @@ a specific number of results.
 
				 === In this section
			
 
				 
			
 
				 * <<run-a-search>>
			
 
				+* <<near-real-time>>
			
 
				 * <<modules-cross-cluster-search>>
			
 
				 * <<async-search-intro>>
			
 
				 
			
 
				 --
			
 
				 
			
 
				 include::run-a-search.asciidoc[]
			
 
				+include::{es-repo-dir}/search/near-real-time.asciidoc[]
			
 
				 include::{es-repo-dir}/async-search.asciidoc[]
			
 
				-include::{es-repo-dir}/modules/cross-cluster-search.asciidoc[]
			
 
				+include::{es-repo-dir}/modules/cross-cluster-search.asciidoc[]
			
--- a/docs/reference/search/near-real-time.asciidoc
+++ b/docs/reference/search/near-real-time.asciidoc
@@ -0,0 +1,25 @@
 
				+[[near-real-time]]
			
 
				+== Near real-time search
			
 
				+The overview of <<documents-indices,documents and indices>> indicates that when a document is stored in {es}, it is indexed and fully searchable in _near real-time_--within 1 second. What defines near real-time search?
			
 
				+
			
 
				+Lucene, the Java library on which {es} is based, introduced the concept of per-segment search. A _segment_ is similar to an inverted index, but the word _index_ in Lucene means "a collection of segments plus a commit point". After a commit, a new segment is added to the commit point and the buffer is cleared.
			
 
				+
			
 
				+Sitting between {es} and the disk is the filesystem cache. Documents in the in-memory indexing buffer (<<img-pre-refresh,Figure 1>>) are written to a new segment (<<img-post-refresh,Figure 2>>). The new segment is written to the filesystem cache first (which is cheap) and only later is it flushed to disk (which is expensive). Once a file is in the cache, however, it can be opened and read just like any other file.
			
 
				+
			
 
				+[[img-pre-refresh]]
			
 
				+.A Lucene index with new documents in the in-memory buffer
			
 
				+image::images/lucene-in-memory-buffer.png["A Lucene index with new documents in the in-memory buffer"]
			
 
				+
			
 
				+Lucene allows new segments to be written and opened, making the documents they contain visible to search without performing a full commit. This is a much lighter process than a commit to disk, and can be done frequently without degrading performance.
			
 
				+
			
 
				+[[img-post-refresh]]
			
 
				+.The buffer contents are written to a segment, which is searchable, but is not yet committed
			
 
				+image::images/lucene-written-not-committed.png["The buffer contents are written to a segment, which is searchable, but is not yet committed"]
			
 
				+
			
 
				+In {es}, this process of writing and opening a new segment is called a _refresh_. A refresh makes all operations performed on an index since the last refresh available for search. You can control refreshes through the following means:
			
 
				+
			
 
				+* Waiting for the refresh interval
			
 
				+* Setting the <<docs-refresh,?refresh>> option
			
 
				+* Using the <<indices-refresh,Refresh API>> to explicitly complete a refresh (`POST _refresh`)
			
 
				+
			
 
				+By default, {es} refreshes indices every second, but only those indices that have received at least one search request in the last 30 seconds. This is why we say that {es} has _near_ real-time search: document changes are not visible to search immediately, but will become visible within this timeframe.
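
Reviewer sketch (not part of the commit): the buffer, refresh, and commit behaviour described in the new page can be illustrated with a small toy model. This is only a sketch. The `ToyShard` class and its methods are hypothetical names invented for illustration and are neither Lucene nor {es} APIs. The sketch assumes a "segment" can be approximated by a plain list of documents.

```python
class ToyShard:
    """Toy model of the text above: an in-memory buffer plus searchable segments.

    Hypothetical illustration only; this is not how Lucene stores data.
    """

    def __init__(self):
        self.buffer = []      # indexed documents that are not yet searchable
        self.segments = []    # each segment is a list of searchable documents
        self.committed = 0    # number of segments covered by the last commit

    def index(self, doc):
        # New documents land in the in-memory buffer first.
        self.buffer.append(doc)

    def refresh(self):
        # Write the buffer to a new segment and open it for search.
        # No commit happens here, which is why this step is cheap.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def commit(self):
        # A commit makes every current segment part of the commit point.
        self.refresh()
        self.committed = len(self.segments)

    def search(self, word):
        # Only documents in opened segments are visible; the buffer is not.
        return [doc for seg in self.segments for doc in seg if word in doc.split()]


shard = ToyShard()
shard.index("quick brown fox")
before = shard.search("fox")   # document is still in the buffer
shard.refresh()
after = shard.search("fox")    # document is now in a searchable segment
```

In this model a document is invisible to `search` until `refresh` runs, and it becomes searchable before any commit, which mirrors the "searchable, but not yet committed" state shown in Figure 2.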