[DOCS] Warn about impact of large readahead on search (#88007)

When using LVM or software RAID on Linux, the kernel or
distribution-specific rules may apply higher default readahead values
to the resulting block device(s). In search-heavy scenarios where mmap
is involved, this can hurt search performance by causing heavy page
cache thrashing.

Add a section to the docs that raises awareness of this value and
recommends the lower default.
Author: Dimitrios Liappis, 3 years ago
commit 5056b666de
2 changed files with 31 additions and 0 deletions
  1. docs/changelog/88007.yaml (+5, -0)
  2. docs/reference/how-to/search-speed.asciidoc (+26, -0)

+ 5 - 0
docs/changelog/88007.yaml

@@ -0,0 +1,5 @@
+pr: 88007
+summary: Warn about impact of large readahead on search
+area: Performance
+type: enhancement
+issues: []

+ 26 - 0
docs/reference/how-to/search-speed.asciidoc

@@ -9,6 +9,32 @@ fast. In general, you should make sure that at least half the available memory
 goes to the filesystem cache so that Elasticsearch can keep hot regions of the
 index in physical memory.
 
+[discrete]
+=== Avoid page cache thrashing by using modest readahead values on Linux
+
+Search can cause a lot of random read I/O. When the underlying block
+device has a high readahead value, the kernel may perform a lot of unnecessary
+read I/O, especially when files are accessed using memory mapping
+(see <<file-system,storage types>>).
+
+Most Linux distributions use a sensible readahead value of `128KiB` for a
+single plain device. However, when using software RAID, LVM, or dm-crypt, the
+resulting block device (backing Elasticsearch <<path-settings,path.data>>)
+may end up with a very large readahead value (in the range of several MiB).
+This usually results in severe page (filesystem) cache thrashing, which
+adversely affects search (or <<docs,update>>) performance.
+
+You can check the current value in `KiB` using
+`lsblk -o NAME,RA,MOUNTPOINT,TYPE,SIZE`.
+Consult the documentation of your distribution on how to alter this value
+(for example with a `udev` rule to persist across reboots, or via
+https://man7.org/linux/man-pages/man8/blockdev.8.html[blockdev --setra]
+as a transient setting). We recommend a value of `128KiB` for readahead.
+
+WARNING: `blockdev` expects values in 512-byte sectors, whereas `lsblk` reports
+values in `KiB`. As an example, to temporarily set readahead to `128KiB`
+for `/dev/nvme0n1`, specify `blockdev --setra 256 /dev/nvme0n1`.
+
 [discrete]
 === Use faster hardware
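
Note on the unit mismatch in the WARNING added above: the conversion between the `KiB` values reported by `lsblk` and the 512-byte sectors expected by `blockdev --setra` is easy to get wrong by a factor of two. The following is a minimal sketch of that arithmetic, not part of this commit. The helper names (`kib_to_sectors`, `sectors_to_kib`, `setra_command`) are hypothetical and chosen for illustration only, and the script only prints the command rather than running it.

```python
SECTOR_BYTES = 512  # `blockdev --setra` counts readahead in 512-byte sectors


def kib_to_sectors(kib: int) -> int:
    """Convert a readahead size in KiB (as reported by lsblk) to 512-byte sectors."""
    if kib < 0:
        raise ValueError("readahead must be non-negative")
    return kib * 1024 // SECTOR_BYTES


def sectors_to_kib(sectors: int) -> float:
    """Convert a sector count (as passed to blockdev --setra) back to KiB."""
    return sectors * SECTOR_BYTES / 1024


def setra_command(device: str, kib: int) -> str:
    """Build (but do not run) the transient blockdev command for a KiB target."""
    return f"blockdev --setra {kib_to_sectors(kib)} {device}"


if __name__ == "__main__":
    # The recommended 128KiB readahead from the docs, for the example device.
    print(setra_command("/dev/nvme0n1", 128))
    # prints: blockdev --setra 256 /dev/nvme0n1
```

A value set this way is transient, as the added docs state. To persist it across reboots, consult your distribution's documentation (for example, for a `udev` rule).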