Fix Cluster Stabilization in SnapshotResiliencyTests (#55159)

As in `AbstractCoordinatorTestCase`, we can't assume the cluster is stable as
soon as the cluster states align on all nodes: stray follower/leader check
tasks can still run after a disconnect and cause later test operations to fail.

Fixed by first running all tasks scheduled within the time span in which these
checks could still fire, and only then validating that the cluster states align
on all nodes, as the coordinator tests already do.
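
The idea can be sketched outside Elasticsearch roughly as follows. This is a minimal illustration, not the test's actual code: `FakeTaskQueue` is a hypothetical stand-in for the deterministic task queue, the 1000 ms window is an arbitrary value rather than `DEFAULT_STABILISATION_TIME`, and the no-op sentinel task is an addition of this sketch so the simulated clock always reaches the end of the window.

```java
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Queue;

public class StabilizeSketch {

    /** Hypothetical stand-in for a deterministic task queue; NOT the Elasticsearch class. */
    static class FakeTaskQueue {
        private static final class Deferred {
            final long time;
            final Runnable task;

            Deferred(long time, Runnable task) {
                this.time = time;
                this.task = task;
            }
        }

        private long currentTimeMillis = 0;
        private final PriorityQueue<Deferred> deferred =
            new PriorityQueue<>(Comparator.comparingLong((Deferred d) -> d.time));
        private final Queue<Runnable> runnable = new ArrayDeque<>();

        long getCurrentTimeMillis() {
            return currentTimeMillis;
        }

        /** Tasks due now become runnable immediately; later ones are deferred. */
        void scheduleAt(long timeMillis, Runnable task) {
            if (timeMillis <= currentTimeMillis) {
                runnable.add(task);
            } else {
                deferred.add(new Deferred(timeMillis, task));
            }
        }

        /** Jump the simulated clock to the next deferred task and make all due tasks runnable. */
        void advanceTime() {
            if (deferred.isEmpty()) {
                throw new IllegalStateException("no deferred tasks to advance to");
            }
            currentTimeMillis = deferred.peek().time;
            while (!deferred.isEmpty() && deferred.peek().time <= currentTimeMillis) {
                runnable.add(deferred.poll().task);
            }
        }

        void runAllRunnableTasks() {
            Runnable next;
            while ((next = runnable.poll()) != null) {
                next.run();
            }
        }
    }

    /** Drain every task that can fire within the stabilisation window before checking state. */
    static void stabilize(FakeTaskQueue queue, long stabilisationMillis) {
        final long endTime = queue.getCurrentTimeMillis() + stabilisationMillis;
        // Assumption of this sketch: a no-op sentinel guarantees the clock can reach endTime
        // even when no other task is scheduled.
        queue.scheduleAt(endTime, () -> {});
        while (queue.getCurrentTimeMillis() < endTime) {
            queue.advanceTime();
            queue.runAllRunnableTasks();
        }
        // A real test would only now assert that cluster states align on all nodes.
    }

    public static void main(String[] args) {
        FakeTaskQueue queue = new FakeTaskQueue();
        int[] strayChecksRun = {0};
        // Simulates a stray follower/leader check scheduled before a disconnect.
        queue.scheduleAt(500, () -> strayChecksRun[0]++);
        stabilize(queue, 1_000);
        System.out.println("stray checks run: " + strayChecksRun[0]);
        System.out.println("simulated time: " + queue.getCurrentTimeMillis());
    }
}
```

Because the stray check at 500 ms falls inside the window, it runs during `stabilize` and cannot interfere with whatever the test does next.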

Closes #55103
Armin Braun, 5 years ago
Commit 88f5149160
1 file changed, with 6 insertions and 1 deletion

+ 6 - 1
server/src/test/java/org/elasticsearch/snapshots/SnapshotResiliencyTests.java

@@ -101,6 +101,7 @@ import org.elasticsearch.cluster.SnapshotsInProgress;
 import org.elasticsearch.cluster.action.index.MappingUpdatedAction;
 import org.elasticsearch.cluster.action.index.NodeMappingRefreshAction;
 import org.elasticsearch.cluster.action.shard.ShardStateAction;
+import org.elasticsearch.cluster.coordination.AbstractCoordinatorTestCase;
 import org.elasticsearch.cluster.coordination.ClusterBootstrapService;
 import org.elasticsearch.cluster.coordination.CoordinationMetadata.VotingConfiguration;
 import org.elasticsearch.cluster.coordination.CoordinationState;
@@ -408,7 +409,6 @@ public class SnapshotResiliencyTests extends ESTestCase {
         assertThat(snapshotIds, hasSize(1));
     }
 
-    @AwaitsFix(bugUrl = "https://github.com/elastic/elasticsearch/issues/55103")
     public void testSnapshotDeleteWithMasterFailover() {
         final int dataNodes = randomIntBetween(2, 10);
         final int masterNodes = randomFrom(3, 5);
@@ -966,6 +966,11 @@ public class SnapshotResiliencyTests extends ESTestCase {
     }
 
     private void stabilize() {
+        final long endTime = deterministicTaskQueue.getCurrentTimeMillis() + AbstractCoordinatorTestCase.DEFAULT_STABILISATION_TIME;
+        while (deterministicTaskQueue.getCurrentTimeMillis() < endTime) {
+            deterministicTaskQueue.advanceTime();
+            deterministicTaskQueue.runAllRunnableTasks();
+        }
         runUntil(
             () -> {
                 final Collection<ClusterState> clusterStates =