
Don't delete local shard data when it's allocated on a node that doesn't exist
This is an extreme case, exposed by a bug we had in our allocation in the local gateway. The bug produced a cluster state whose nodes list doesn't include a node, while the routing table still has a shard pointing at that nonexistent node. Then, when a node on the same box comes back, it deletes its local shard data because it thinks the shard is fully allocated on other nodes.
fixes #4502
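
The guard described above can be illustrated with a minimal, self-contained sketch. It is not the Elasticsearch implementation: `StaleNodeGuardSketch`, `Routing`, and `canDeleteLocalShard` are hypothetical names, and the cluster state is reduced to a plain set of known node ids. It only models the two checks visible in this diff (stale node ids, and copies that belong to the local node).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StaleNodeGuardSketch {

    // Hypothetical stand-in for one shard copy in the routing table.
    // relocatingNodeId is null when the copy is not relocating.
    static final class Routing {
        final String currentNodeId;
        final String relocatingNodeId;

        Routing(String currentNodeId, String relocatingNodeId) {
            this.currentNodeId = currentNodeId;
            this.relocatingNodeId = relocatingNodeId;
        }
    }

    // Returns true only if every copy points at nodes that exist in the
    // cluster state and none of the copies involves the local node.
    static boolean canDeleteLocalShard(Set<String> knownNodeIds, String localNodeId, List<Routing> copies) {
        if (copies.isEmpty()) {
            return false; // assumption: nothing routed means nothing is safely held elsewhere
        }
        for (Routing copy : copies) {
            // Stale routing entry: the node it names is not in the nodes list.
            if (!knownNodeIds.contains(copy.currentNodeId)) {
                return false;
            }
            if (copy.relocatingNodeId != null && !knownNodeIds.contains(copy.relocatingNodeId)) {
                return false;
            }
            // The shard is active on, or relocating to/from, the local node.
            if (localNodeId.equals(copy.currentNodeId) || localNodeId.equals(copy.relocatingNodeId)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> nodes = new HashSet<>(Arrays.asList("node-B"));
        // The routing table still points at "node-A", which is missing from the nodes list.
        List<Routing> staleCopies = Arrays.asList(new Routing("node-A", null));
        System.out.println(canDeleteLocalShard(nodes, "node-B", staleCopies));
    }
}
```

In the scenario from the commit message, the restarted node sees a copy routed to a node id that is not its own, so without the stale-node check the shard would look safely allocated elsewhere and the local data would be eligible for deletion.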

Shay Banon 12 years ago
parent
commit
f0356b2126

+ 16 - 2
src/main/java/org/elasticsearch/indices/store/IndicesStore.java

@@ -140,10 +140,24 @@ public class IndicesStore extends AbstractComponent implements ClusterStateListe
                             shardCanBeDeleted = false;
                             break;
                         }
-                        String localNodeId = clusterService.localNode().id();
+
+                        // if the allocated or relocating node id doesn't exist in the cluster state, it's a stale
+                        // node; make sure we don't do anything with this shard until the routing table has been
+                        // rerouted to reflect the fact that the node does not exist
+                        if (!event.state().nodes().nodeExists(shardRouting.currentNodeId())) {
+                            shardCanBeDeleted = false;
+                            break;
+                        }
+                        if (shardRouting.relocatingNodeId() != null) {
+                            if (!event.state().nodes().nodeExists(shardRouting.relocatingNodeId())) {
+                                shardCanBeDeleted = false;
+                                break;
+                            }
+                        }
+
                         // check if shard is active on the current node or is getting relocated to the our node
+                        String localNodeId = clusterService.localNode().id();
                         if (localNodeId.equals(shardRouting.currentNodeId()) || localNodeId.equals(shardRouting.relocatingNodeId())) {
-                            // shard will be used locally - keep it
                             shardCanBeDeleted = false;
                             break;
                         }