Description
ClientRoutesIT.should_survive_full_node_replacement_through_nlb consistently fails at line 650 (nlb.addNode(2)) with:
java.net.BindException: Address already in use (Bind failed)
Root Cause
NlbSimulator.rebuildDiscoveryProxy() creates a new RoundRobinProxy on the discovery port before closing the old one:
// NlbSimulator.java:166-170
discoveryProxy = new RoundRobinProxy(bindAddress, basePort, targets); // binds NEW socket
if (oldProxy != null) {
oldProxy.close(); // closes OLD socket — too late if bind already failed
}
When addNode(2) is called, it first creates the per-node TcpProxy on port basePort+2, which succeeds. It then calls rebuildDiscoveryProxy(), which tries to bind a new server socket to basePort while the old proxy is still listening on that port. SO_REUSEADDR does not help here: on Linux it permits rebinding a port whose previous connections linger in TIME_WAIT, but not binding a second socket to an address:port that another socket is still listening on.
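The failure mode can be reproduced outside the test suite with plain java.net.ServerSocket. The following is a minimal sketch, not code from NlbSimulator: the class name DoubleBindDemo and its method are hypothetical, and a bare ServerSocket stands in for RoundRobinProxy. It binds one listening socket on an OS-chosen loopback port, then tries to bind a second one to the same port with SO_REUSEADDR enabled on both.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class; ServerSocket stands in for RoundRobinProxy.
class DoubleBindDemo {

    /** Returns true if a second listener on the same address:port is rejected. */
    static boolean secondBindFails() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket first = new ServerSocket()) {
            first.setReuseAddress(true);
            // Port 0 lets the OS pick a free port, so the demo does not
            // collide with anything else running on the machine.
            first.bind(new InetSocketAddress(loopback, 0));
            int port = first.getLocalPort();

            try (ServerSocket second = new ServerSocket()) {
                second.setReuseAddress(true);
                // Same address:port while 'first' is still listening.
                second.bind(new InetSocketAddress(loopback, port));
                return false; // the platform allowed the double bind
            } catch (BindException e) {
                return true; // "Address already in use"
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

On Linux the second bind is expected to be rejected, which matches the exception seen at nlb.addNode(2). Other platforms treat SO_REUSEADDR differently, so the result may not carry over.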
Fix
Close the old proxy before creating the new one. The brief window where no discovery proxy exists is acceptable in test infrastructure — no production traffic relies on uninterrupted availability of the test proxy.
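The reordering can be sketched as follows. This is an illustration under stated assumptions rather than the actual patch: CloseBeforeBindSketch and its members are hypothetical names, and a plain ServerSocket again stands in for RoundRobinProxy. The only point it shows is the order of operations in rebuild(): release the old listener first, then bind the replacement on the same port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical stand-in for the discovery-proxy part of NlbSimulator.
class CloseBeforeBindSketch implements AutoCloseable {
    private final InetAddress bindAddress = InetAddress.getLoopbackAddress();
    private final int port;          // plays the role of basePort
    private ServerSocket discovery;  // plays the role of discoveryProxy

    CloseBeforeBindSketch() throws IOException {
        discovery = open(0);               // OS picks a free port
        port = discovery.getLocalPort();   // remember it for later rebuilds
    }

    private ServerSocket open(int p) throws IOException {
        ServerSocket s = new ServerSocket();
        s.setReuseAddress(true);
        s.bind(new InetSocketAddress(bindAddress, p));
        return s;
    }

    /** Close the old listener first, then bind the new one on the same port. */
    void rebuild() throws IOException {
        ServerSocket old = discovery;
        discovery = null;
        if (old != null) {
            old.close(); // frees the port before the new bind is attempted
        }
        discovery = open(port);
    }

    int port() {
        return port;
    }

    boolean isListening() {
        return discovery != null && discovery.isBound() && !discovery.isClosed();
    }

    @Override
    public void close() throws IOException {
        if (discovery != null) {
            discovery.close();
        }
    }

    public static void main(String[] args) throws IOException {
        try (CloseBeforeBindSketch sketch = new CloseBeforeBindSketch()) {
            sketch.rebuild();
            System.out.println("listening after rebuild: " + sketch.isListening());
        }
    }
}
```

Between old.close() and open(port) nothing is listening on the port, which is the brief gap the fix accepts for test infrastructure.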
Affected Tests
- ClientRoutesIT.should_survive_full_node_replacement_through_nlb
- Reproduced on branches: enable-shard-awareness-privatelink, fix/control-connection-resolve-endpoint