
ClientRoutesIT.should_survive_full_node_replacement_through_nlb fails with Address already in use #851

@dkropachev

Description

ClientRoutesIT.should_survive_full_node_replacement_through_nlb consistently fails at line 650 (nlb.addNode(2)) with:

java.net.BindException: Address already in use (Bind failed)

Root Cause

NlbSimulator.rebuildDiscoveryProxy() creates a new RoundRobinProxy on the discovery port before closing the old one:

// NlbSimulator.java:166-170
discoveryProxy = new RoundRobinProxy(bindAddress, basePort, targets); // binds NEW socket
if (oldProxy != null) {
    oldProxy.close(); // closes OLD socket — too late if bind already failed
}

When addNode(2) is called, it first creates the per-node TcpProxy on port basePort+2, which succeeds. It then calls rebuildDiscoveryProxy(), which tries to bind a new server socket to basePort while the old proxy still holds that port. SO_REUSEADDR does not help here: on Linux it allows rebinding a port whose earlier connections are still in TIME_WAIT, but not a port on which another socket is actively listening.
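The failure mode can be reproduced outside the test suite. The following is a minimal sketch using plain java.net.ServerSocket rather than the project's RoundRobinProxy; the class and method names (DoubleBindDemo, secondBindFails) are made up for illustration, and the result is described for Linux only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class, not part of the driver code base.
class DoubleBindDemo {

    /**
     * Binds one listener to an ephemeral loopback port, then tries to bind a
     * second listener to the same address:port while the first is still open.
     * Returns true if the second bind is rejected with a BindException.
     */
    static boolean secondBindFails() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket first = new ServerSocket()) {
            first.setReuseAddress(true);
            first.bind(new InetSocketAddress(loopback, 0)); // port 0 = let the OS pick
            int port = first.getLocalPort();

            try (ServerSocket second = new ServerSocket()) {
                second.setReuseAddress(true); // SO_REUSEADDR set on both sockets
                second.bind(new InetSocketAddress(loopback, port));
                return false; // bind unexpectedly succeeded
            } catch (BindException e) {
                return true; // "Address already in use"
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

This mirrors the order of operations in rebuildDiscoveryProxy(): the new listener is bound while the old one is still open.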

Fix

Close the old proxy before creating the new one. The brief window in which no discovery proxy is listening is acceptable in test infrastructure, since no production traffic depends on the test proxy being continuously available.
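The reordering can be sketched as follows. This is not the actual NlbSimulator patch: a plain ServerSocket stands in for RoundRobinProxy, and the class CloseBeforeRebindSketch with its methods rebuild(), isListening() and close() is hypothetical. The sketch also picks an ephemeral port on the first call instead of using a fixed basePort.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical stand-in for the discovery-proxy handling in NlbSimulator.
class CloseBeforeRebindSketch {
    private final InetAddress bindAddress;
    private ServerSocket discoverySocket; // plays the role of discoveryProxy
    private int port;                     // 0 until the first bind picks a port

    CloseBeforeRebindSketch(InetAddress bindAddress) {
        this.bindAddress = bindAddress;
    }

    /** Closes the old listener first, then binds a new one on the same port. */
    synchronized int rebuild() {
        try {
            if (discoverySocket != null) {
                discoverySocket.close(); // release the port BEFORE rebinding
                discoverySocket = null;
            }
            ServerSocket fresh = new ServerSocket();
            try {
                fresh.setReuseAddress(true);
                fresh.bind(new InetSocketAddress(bindAddress, port));
            } catch (IOException e) {
                fresh.close(); // do not leak the socket if bind fails
                throw e;
            }
            port = fresh.getLocalPort();
            discoverySocket = fresh;
            return port;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized boolean isListening() {
        return discoverySocket != null && !discoverySocket.isClosed();
    }

    synchronized void close() {
        try {
            if (discoverySocket != null) {
                discoverySocket.close();
                discoverySocket = null;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CloseBeforeRebindSketch sim =
                new CloseBeforeRebindSketch(InetAddress.getLoopbackAddress());
        int first = sim.rebuild();
        int second = sim.rebuild(); // would throw if the old listener were still open
        System.out.println("rebound on same port: " + (first == second));
        sim.close();
    }
}
```

The trade-off is the one described above: between close() and bind() nothing is listening on the discovery port, so a client connecting in that window is refused and has to retry.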

Affected Tests

  • ClientRoutesIT.should_survive_full_node_replacement_through_nlb
  • Reproduced on branches: enable-shard-awareness-privatelink, fix/control-connection-resolve-endpoint
