Mention the cost of tracking live docs in scrolls (#41375)

Relates #41337, in which a heap dump shows hundreds of MBs allocated on the
heap for tracking the live docs for each scroll.
David Turner 6 years ago
parent
commit
b47d65c917
1 changed file with 21 additions and 11 deletions

+ 21 - 11
docs/reference/search/request/scroll.asciidoc

@@ -103,6 +103,12 @@ GET /_search?scroll=1m
 [[scroll-search-context]]
 ==== Keeping the search context alive
 
+A scroll returns all the documents which matched the search at the time of the
+initial search request. It ignores any subsequent changes to these documents.
+The `scroll_id` identifies a _search context_ which keeps track of everything
+that {es} needs to return the correct documents. The search context is created
+by the initial request and kept alive by subsequent requests.
+
 The `scroll` parameter (passed to the `search` request and to every `scroll`
 request) tells Elasticsearch how long it should keep the search context alive.
 Its value (e.g. `1m`, see <<time-units>>) does not need to be long enough to
@@ -112,17 +118,21 @@ new  expiry time. If a `scroll` request doesn't pass in the `scroll`
 parameter, then the search context will be freed as part of _that_ `scroll`
 request.
 
-Normally, the background merge process optimizes the
-index by merging together smaller segments to create new bigger segments, at
-which time the smaller segments are deleted. This process continues during
-scrolling, but an open search context prevents the old segments from being
-deleted while they are still in use.  This is how Elasticsearch is able to
-return the results of the initial search request, regardless of subsequent
-changes to documents.
-
-TIP: Keeping older segments alive means that more file handles are needed.
-Ensure that you have configured your nodes to have ample free file handles.
-See <<file-descriptors>>.
+Normally, the background merge process optimizes the index by merging together
+smaller segments to create new, bigger segments. Once the smaller segments are
+no longer needed they are deleted. This process continues during scrolling, but
+an open search context prevents the old segments from being deleted since they
+are still in use.
+
+TIP: Keeping older segments alive means that more disk space and file handles
+are needed. Ensure that you have configured your nodes to have ample free file
+handles. See <<file-descriptors>>.
+
+Additionally, if a segment contains deleted or updated documents then the
+search context must keep track of whether each document in the segment was live
+at the time of the initial search request. Ensure that your nodes have
+sufficient heap space if you have many open scrolls on an index that is subject
+to ongoing deletes or updates.
 
 NOTE: To prevent against issues caused by having too many scrolls open, the
 user is not allowed to open scrolls past a certain limit. By default, the