-
Notifications
You must be signed in to change notification settings - Fork 4.6k
xds/cdsbalancer: change TestErrorFromParentLB_ResourceNotFound to send correct resources #8794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xds/cdsbalancer: change TestErrorFromParentLB_ResourceNotFound to send correct resources #8794
Conversation
2838498 to
d83dd4b
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #8794 +/- ##
==========================================
- Coverage 83.42% 83.36% -0.06%
==========================================
Files 418 417 -1
Lines 32897 32978 +81
==========================================
+ Hits 27443 27491 +48
- Misses 4069 4084 +15
- Partials 1385 1403 +18 🚀 New features to boost your workflow:
|
|
Please add a more descriptive title and description. This is what we recommend for our external contributors: https://github.com/grpc/grpc-go/blob/master/CONTRIBUTING.md#pr-descriptions |
|
|
||
| // Replace DNS resolver with a wrapped resolver to capture ResolveNow calls. | ||
| resolveNowCh := make(chan struct{}, 1) | ||
| resolveNowCh := make(chan struct{}, 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this change? How is a reviewer supposed to understand the motivation for this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a flaky test which I thought was because of my changes but this is an already flaky test which pranjali is working on fixing , so reverting the change now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flake reported in this issue is caused by the changes made for logical-dns.
Fixed this in adb4625.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, the root cause described by @Pranjali-2501 was only in her fork, not on the master branch.
If this is correct, @eshitachandwani please file an issue with the logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have already reverted the change. Maybe you are looking at the wrong commit.
|
|
||
| const ( | ||
| defaultTestTimeout = 10 * time.Second | ||
| defaultTestTimeout = 20 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Why do these tests need 20s to run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Earlier, with my changes , the test was taking around 15 secs to pass, because the EDS update was taking a tittle longer to be received.
But that doesn't seem to be the case anymore after a few other changes, so reverting the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests usually finish in < 100 milliseconds. There are some tests that make 1000s of RPCs to verify RPC distribution which take 1-5s. A test taking 10+ seconds usually means that there's probably a deadlock somewhere which gets resolved when teardown begins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! Maybe there was some issue with my changes that were making it fail. Which I have fixed and I have already reverted this change.
+1. Also, can you describe what caused the flakiness and how does this PR fix it? I suspect it was a recent change because I saw the same test flake for the first time and filed an issue: #8803 |
|
This change does not fix the flakiness , or rather that was not what I had in mind. While looking at the test I found that it configured one set of resources , then removes CDS,EDS and then instead of adding it back again , it just configures a whole new set of LDS,RDS,CDS,EDS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, please improve the PR title. Using a short but descriptive title makes things easier for people browsing through the commit history.
This PR fixes the
TestErrorFromParentLB_ResourceNotFoundto reconfigure the same CDS and EDS resources after they have been removed instead of configuring all the new resources.Also removes unused
defaultTestShortTimeoutfrom internal/xds/balancer/clustermanager/clustermanager_test.goRELEASE NOTES: None