[[system-config-tcpretries]]
=== TCP retransmission timeout

Each pair of {es} nodes communicates via a number of TCP connections which
<<long-lived-connections,remain open>> until one of the nodes shuts down or
communication between the nodes is disrupted by a failure in the underlying
infrastructure.

TCP provides reliable communication over occasionally unreliable networks by
hiding temporary network disruptions from the communicating applications. Your
operating system will retransmit any lost messages a number of times before
informing the sender of any problem. {es} must wait while the retransmissions
are happening and can only react once the operating system decides to give up.
Users must therefore also wait for a sequence of retransmissions to complete.

Most Linux distributions default to retransmitting any lost packets 15 times.
Retransmissions back off exponentially, so these 15 retransmissions take over
900 seconds to complete. This means it takes Linux many minutes to detect a
network partition or a failed node with this method. Windows defaults to just 5
retransmissions, which corresponds to a timeout of around 6 seconds.

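To see where a figure of over 900 seconds comes from, the following sketch
sums the back-off intervals for a given retransmission limit. It follows the
hypothetical model described in the Linux kernel documentation and rests on
assumptions that this page does not state: an initial retransmission timeout
of 0.2 seconds, doubling on each attempt, a cap of 120 seconds per interval,
and the connection being dropped when the interval after the last
retransmission expires. `estimate_tcp_timeout` is an illustrative name, not an
existing tool. Real timeouts depend on measured round-trip times, so treat the
result as a rough estimate only.

```shell
#!/bin/sh
# Estimate how long the kernel keeps retrying an unacknowledged segment.
# Assumptions (not from this page): initial RTO 0.2s, RTO capped at 120s,
# connection dropped when the (N+1)th interval expires.
estimate_tcp_timeout() {
  awk -v n="$1" 'BEGIN {
    rto = 0.2; cap = 120; total = 0
    # N retransmissions means N+1 waiting intervals in this model.
    for (i = 0; i <= n; i++) {
      total += rto
      rto *= 2
      if (rto > cap) rto = cap
    }
    printf "%.1f\n", total
  }'
}

estimate_tcp_timeout 15   # prints 924.6
```
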
The Linux default allows for communication over networks that may experience
very long periods of packet loss, but this default is excessive and even harmful
on the high-quality networks used by most {es} installations. When a cluster
detects a node failure it reacts by reallocating lost shards, rerouting
searches, and possibly electing a new master node. Highly available clusters
must be able to detect node failures promptly, which can be achieved by reducing
the permitted number of retransmissions. Connections to
<<modules-remote-clusters,remote clusters>> should also detect failures much
more quickly than the Linux default allows. Linux users should therefore reduce
the maximum number of TCP retransmissions.

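Before changing anything, it can help to check what a node is currently
using. The sketch below reads the value from
`/proc/sys/net/ipv4/tcp_retries2` and compares it with a target.
`check_tcp_retries2` is a hypothetical helper, and its optional second argument
is only there so that the function can be pointed at a different file when
testing.

```shell
#!/bin/sh
# Report whether the retransmission limit exceeds a target value.
# Hypothetical helper; arguments: [target] [file to read the value from].
check_tcp_retries2() {
  target=${1:-5}
  file=${2:-/proc/sys/net/ipv4/tcp_retries2}
  current=$(cat "$file") || return 2
  if [ "$current" -gt "$target" ]; then
    echo "tcp_retries2 is $current, higher than $target"
    return 1
  fi
  echo "tcp_retries2 is $current"
}
```

Calling `check_tcp_retries2 5` on a Linux host prints the current value and
returns a non-zero status if it is above `5`.
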
You can decrease the maximum number of TCP retransmissions to `5` by running the
following command as `root`. Five retransmissions correspond to a timeout of
around six seconds.

[source,sh]
-------------------------------------
sysctl -w net.ipv4.tcp_retries2=5
-------------------------------------

To set this value permanently, update the `net.ipv4.tcp_retries2` setting in
`/etc/sysctl.conf`. To verify after rebooting, run
`sysctl net.ipv4.tcp_retries2`.

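One way to make the permanent change repeatable is a small function that
either replaces an existing assignment in the file or appends a new one. This
is a minimal sketch: `set_sysctl_conf` is a hypothetical helper, and it assumes
the file uses the usual `key = value` layout with one setting per line.

```shell
#!/bin/sh
# Hypothetical helper: set a key in a sysctl-style file, replacing an existing
# assignment if one is present and appending a new line otherwise.
set_sysctl_conf() {
  file=$1 key=$2 value=$3
  # Escape dots so the key is matched literally in the regular expressions.
  pattern="^[[:space:]]*$(printf '%s' "$key" | sed 's/\./\\./g')[[:space:]]*="
  if grep -Eq "$pattern" "$file" 2>/dev/null; then
    # Keeps a backup of the previous contents next to the file as <file>.bak.
    sed -i.bak -E "s|${pattern}.*|${key} = ${value}|" "$file"
  else
    printf '%s = %s\n' "$key" "$value" >> "$file"
  fi
}
```

Run as `root`, `set_sysctl_conf /etc/sysctl.conf net.ipv4.tcp_retries2 5`
followed by `sysctl -p` would write the setting and reload the file without a
reboot.
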
IMPORTANT: This setting applies to all TCP connections and will affect the
reliability of communication with systems other than {es} clusters too. If your
clusters communicate with external systems over a low-quality network then you
may need to select a higher value for `net.ipv4.tcp_retries2`. For this reason,
{es} does not adjust this setting automatically.

==== Related configuration

{es} also implements its own internal health checks with timeouts that are much
shorter than the default retransmission timeout on Linux. Since these are
application-level health checks, their timeouts must allow for application-level
effects such as garbage collection pauses. You should not reduce any timeouts
related to these application-level health checks.

You must also ensure your network infrastructure does not interfere with the
long-lived connections between nodes, <<long-lived-connections,even if those
connections appear to be idle>>. Devices which drop connections when they reach
a certain age are a common source of problems for {es} clusters, and must not be
used.