test: remove fixed waits from trivial ERS e2e #97

Draft
cursor[bot] wants to merge 1 commit into main from cursor/vitess-ci-test-performance-d7f4

cursor bot commented Mar 18, 2026

Description

This PR speeds up one of the slowest tests in the slowest meaningful cluster_endtoend shard on main by removing fixed sleeps from go/test/endtoend/reparent/emergencyreparent.TestTrivialERS.

A review of the latest successful upstream CI runs (via gh --repo vitessio/vitess) showed that the most recent meaningful cluster_endtoend success was run 23057103687, in which Run endtoend tests on Cluster (ers_prs_newfeatures_heavy) was the slowest job at about 24m17s. Within that shard, TestTrivialERS was one of the largest individual contributors at about 62.97s. Of that time, 40s was pure idle time: eight unconditional time.Sleep(5 * time.Second) calls after ERS operations (8 × 5s = 40s).

This change replaces those fixed waits with state-based validation that waits only until the cluster is actually healthy again:

  • validate the topology after each ERS invocation
  • identify the current primary dynamically
  • verify there is exactly one primary and the remaining tablets are replicas
  • confirm replication still works by writing through the current primary and checking the replicas
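
As an illustration of the "exactly one primary, everything else a replica" check listed above, here is a minimal sketch. The `Tablet` and `TabletType` types and the `findPrimary` helper are hypothetical stand-ins written for this example; they are not the Vitess test utilities or the code in this PR.

```go
package main

import "fmt"

// TabletType is a hypothetical stand-in for the tablet roles the test cares about.
type TabletType string

const (
	Primary TabletType = "PRIMARY"
	Replica TabletType = "REPLICA"
)

// Tablet is a hypothetical, simplified view of a tablet: an alias and its role.
type Tablet struct {
	Alias string
	Type  TabletType
}

// findPrimary identifies the current primary dynamically. It returns an error
// unless there is exactly one primary and every other tablet is a replica.
func findPrimary(tablets []Tablet) (Tablet, error) {
	var primary Tablet
	primaries := 0
	for _, t := range tablets {
		switch t.Type {
		case Primary:
			primaries++
			primary = t
		case Replica:
			// Expected role for every non-primary tablet.
		default:
			return Tablet{}, fmt.Errorf("tablet %s has unexpected type %q", t.Alias, t.Type)
		}
	}
	if primaries != 1 {
		return Tablet{}, fmt.Errorf("expected exactly 1 primary, found %d", primaries)
	}
	return primary, nil
}
```

Because the helper returns the primary it found, a caller could use that tablet for the follow-up write, rather than assuming which tablet ERS promoted.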

That preserves the test's intent while avoiding hard-coded idle time.

Related Issue(s)

  • None

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

No deployment changes.

Validation

Local targeted e2e validation:

source build.env
rm -rf "$VTDATAROOT"/vtroot_*
go test -count=1 -timeout 20m -run '^TestTrivialERS$' vitess.io/vitess/go/test/endtoend/reparent/emergencyreparent

Result after this change:

  • ok vitess.io/vitess/go/test/endtoend/reparent/emergencyreparent 30.230s

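For comparison, the latest meaningful upstream CI log showed this test at about 62.97s, so the local run took roughly half as long. The two numbers come from different environments (local machine vs. CI runner), so the comparison is indicative rather than exact.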

AI Disclosure

This PR was authored by GPT-5 with local validation and CI log analysis.

Signed-off-by: Cursor Agent <cursoragent@cursor.com>