
Clarify rolling upgrade fallback to restart upgrade (#42161)

Adds a note that restarting half-or-more of the master-eligible nodes means
you're no longer doing a rolling upgrade, and may need to upgrade all the
things before the cluster returns to health.
David Turner 6 years ago
parent
commit
f3dbfdb444
1 changed files with 18 additions and 9 deletions

+ 18 - 9
docs/reference/upgrade/rolling_upgrade.asciidoc

@@ -168,20 +168,29 @@ include::open-ml.asciidoc[]
 
 During a rolling upgrade, the cluster continues to operate normally. However,
 any new functionality is disabled or operates in a backward compatible mode
-until all nodes in the cluster are upgraded. New functionality
-becomes operational once the upgrade is complete and all nodes are running the
-new version. Once that has happened, there's no way to return to operating
-in a backward compatible mode. Nodes running the previous major version will
-not be allowed to join the fully-updated cluster.
+until all nodes in the cluster are upgraded. New functionality becomes
+operational once the upgrade is complete and all nodes are running the new
+version. Once that has happened, there's no way to return to operating in a
+backward compatible mode. Nodes running the previous major version will not be
+allowed to join the fully-updated cluster.
 
 In the unlikely case of a network malfunction during the upgrade process that
-isolates all remaining old nodes from the cluster, you must take the
-old nodes offline and upgrade them to enable them to join the cluster.
+isolates all remaining old nodes from the cluster, you must take the old nodes
+offline and upgrade them to enable them to join the cluster.
+
+If you stop half or more of the master-eligible nodes all at once during the
+upgrade then the cluster will become unavailable, meaning that the upgrade is
+no longer a _rolling_ upgrade. If this happens, you should upgrade and restart
+all of the stopped master-eligible nodes to allow the cluster to form again, as
+if performing a <<restart-upgrade,full-cluster restart upgrade>>. It may also
+be necessary to upgrade all of the remaining old nodes before they can join the
+cluster after it re-forms.
 
 Similarly, if you run a testing/development environment with only one master
 node, the master node should be upgraded last. Restarting a single master node
 forces the cluster to be reformed. The new cluster will initially only have the
 upgraded master node and will thus reject the older nodes when they re-join the
-cluster. Nodes that have already been upgraded will successfully re-join the 
-upgraded master. 
+cluster. Nodes that have already been upgraded will successfully re-join the
+upgraded master.
+
 ====================================================
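
The "half or more" condition added in this change can be illustrated with a small check. This is only a sketch of the arithmetic stated in the new paragraph, not anything Elasticsearch itself exposes: the function name `stays_rolling` and its parameters are hypothetical, and the caller is assumed to already know how many master-eligible nodes the cluster has.

```python
def stays_rolling(master_eligible_total: int, stopped_at_once: int) -> bool:
    """Return True if stopping `stopped_at_once` master-eligible nodes together
    still leaves strictly more than half of them running.

    Hypothetical helper: it only encodes the rule from the note above, namely
    that stopping half or more of the master-eligible nodes at once makes the
    cluster unavailable, so the upgrade is no longer a rolling one.
    """
    if master_eligible_total <= 0:
        raise ValueError("cluster needs at least one master-eligible node")
    if not 0 <= stopped_at_once <= master_eligible_total:
        raise ValueError("stopped_at_once must be between 0 and the total")
    # "Half or more" stopped means 2 * stopped >= total, so the upgrade stays
    # rolling only while 2 * stopped is strictly less than the total.
    return 2 * stopped_at_once < master_eligible_total


if __name__ == "__main__":
    for total, stopped in [(3, 1), (3, 2), (4, 2), (1, 1)]:
        print(total, stopped, stays_rolling(total, stopped))
```

With one master-eligible node, stopping it always counts as "half or more", which matches the existing note that a single master node forces the cluster to re-form.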