fix: resolve control node endpoint for ClientRoutes contact point matching #843

Draft
dkropachev wants to merge 1 commit into scylladb:scylla-4.x from dkropachev:fix/client-routes-contact-point-matching

@dkropachev dkropachev commented Mar 19, 2026

Summary

  • With ClientRoutes, contact points use DefaultEndPoint (discovery proxy) while discovered nodes use ClientRoutesEndPoint (keyed by host_id). These types never match via equals(), causing InitialNodeListRefresh to emit spurious remove+add events for the control node.
  • Added a TopologyMonitor.getChannelNodeInfo(channel) API that queries system.local and returns the proper endpoint for the connected node.
  • ControlConnection.connect() calls this after each successful connection and updates the channel's endpoint via DriverChannel.setEndPoint(). For ClientRoutes, this upgrades the channel from DefaultEndPoint to ClientRoutesEndPoint.
  • InitialNodeListRefresh matches the control node by endpoint equality against the (now-upgraded) channel endpoint, falling back to the first unmatched contact point
  • If the system.local query fails, the control connection retries the next node
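
To illustrate the mismatch described above, here is a minimal sketch of why two endpoint types with different identity keys never compare equal, and how swapping the channel's endpoint fixes the comparison. All class names here (`ProxyEndPoint`, `HostIdEndPoint`, `Channel`) are hypothetical stand-ins, not the driver's real `DefaultEndPoint`, `ClientRoutesEndPoint`, or `DriverChannel`; the real `equals()` implementations may differ.

```java
import java.util.UUID;

// Hypothetical stand-ins for the driver types named in this PR; not the real classes.
final class EndpointMismatchSketch {

  interface EndPoint {}

  // Stand-in for a contact point endpoint: identity is the proxy address.
  static final class ProxyEndPoint implements EndPoint {
    private final String address;

    ProxyEndPoint(String address) {
      this.address = address;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof ProxyEndPoint && ((ProxyEndPoint) o).address.equals(address);
    }

    @Override
    public int hashCode() {
      return address.hashCode();
    }
  }

  // Stand-in for a discovered-node endpoint: identity is the host_id.
  static final class HostIdEndPoint implements EndPoint {
    private final UUID hostId;

    HostIdEndPoint(UUID hostId) {
      this.hostId = hostId;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof HostIdEndPoint && ((HostIdEndPoint) o).hostId.equals(hostId);
    }

    @Override
    public int hashCode() {
      return hostId.hashCode();
    }
  }

  // Stand-in for a channel whose endpoint can be replaced after connecting.
  static final class Channel {
    private EndPoint endPoint;

    Channel(EndPoint endPoint) {
      this.endPoint = endPoint;
    }

    EndPoint getEndPoint() {
      return endPoint;
    }

    void setEndPoint(EndPoint endPoint) {
      this.endPoint = endPoint;
    }
  }

  public static void main(String[] args) {
    UUID hostId = UUID.randomUUID();
    EndPoint discovered = new HostIdEndPoint(hostId);

    // The channel starts with the contact point's endpoint type.
    Channel channel = new Channel(new ProxyEndPoint("proxy.example:9042"));
    System.out.println("before upgrade: " + channel.getEndPoint().equals(discovered));

    // After learning the host_id (assumed to come from system.local), upgrade the endpoint.
    channel.setEndPoint(new HostIdEndPoint(hostId));
    System.out.println("after upgrade: " + channel.getEndPoint().equals(discovered));
  }
}
```

In this sketch the comparison only succeeds once both sides are keyed by the same identity, which is the effect the endpoint upgrade is meant to achieve.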

Test plan

  • All unit tests pass (MetadataManagerTest, InitialNodeListRefreshTest, ClientRoutesTopologyMonitorTest, ControlConnectionTest — 84 tests)
  • ClientRoutesIT integration test against ScyllaDB Enterprise 2026.1

…tact point matching

With ClientRoutes, contact points use DefaultEndPoint (discovery proxy) while
discovered nodes use ClientRoutesEndPoint (keyed by host_id). These endpoint
types never match via equals(), causing InitialNodeListRefresh to fail to match
the contact point to the system.local node, resulting in a spurious remove+add
cycle.

Add TopologyMonitor.getChannelNodeInfo(channel) API that queries system.local
and returns the proper endpoint for the connected node. ControlConnection calls
this after each successful connect() and updates the channel's endpoint via
DriverChannel.setEndPoint(). For ClientRoutes, this upgrades the channel from
DefaultEndPoint to ClientRoutesEndPoint. InitialNodeListRefresh then matches
the control node by endpoint equality and falls back to the first unmatched
contact point.
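
The matching rule can be sketched roughly as follows. This is one reading of "match by endpoint equality, fall back to the first unmatched contact point", not the actual InitialNodeListRefresh code; `matchControlNode` and its parameters are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper; the real refresh logic operates on driver node objects.
final class ControlNodeMatchSketch {

  // Returns the contact point to associate with the control node:
  // 1) an unmatched contact point equal to the channel endpoint, else
  // 2) the first contact point that has not been matched yet, else
  // 3) empty when every contact point is already matched.
  static <E> Optional<E> matchControlNode(
      E channelEndPoint, List<E> contactPoints, Set<E> alreadyMatched) {
    for (E contactPoint : contactPoints) {
      if (!alreadyMatched.contains(contactPoint) && contactPoint.equals(channelEndPoint)) {
        return Optional.of(contactPoint);
      }
    }
    for (E contactPoint : contactPoints) {
      if (!alreadyMatched.contains(contactPoint)) {
        return Optional.of(contactPoint);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Strings stand in for endpoints to keep the example short.
    List<String> contactPoints = Arrays.asList("proxy-1", "host-b");
    Set<String> alreadyMatched = new HashSet<>();
    System.out.println(matchControlNode("host-b", contactPoints, alreadyMatched));
    System.out.println(matchControlNode("host-x", contactPoints, alreadyMatched));
  }
}
```

The fallback branch is what keeps a contact point from being dropped and re-added when no endpoint compares equal.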

If the system.local query fails, the control connection retries the next node.
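
A rough sketch of the retry behavior, assuming the node info lookup reports failure by returning an empty result. `connectFirstResolvable` and `nodeInfoQuery` are hypothetical names; the real ControlConnection is asynchronous and handles errors differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical synchronous model of "try the next node when the query fails".
final class ControlConnectRetrySketch {

  // Walks the candidates in order and returns the first endpoint that the
  // lookup (a stand-in for the system.local query) manages to resolve.
  static <N, E> Optional<E> connectFirstResolvable(
      List<N> candidates, Function<N, Optional<E>> nodeInfoQuery) {
    for (N node : candidates) {
      Optional<E> resolved = nodeInfoQuery.apply(node);
      if (resolved.isPresent()) {
        return resolved;
      }
      // Lookup failed for this node: fall through and try the next one.
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<String> candidates = Arrays.asList("node-a", "node-b", "node-c");
    // Simulate a failing lookup on the first node only.
    Function<String, Optional<String>> query =
        node -> node.equals("node-a") ? Optional.<String>empty() : Optional.of("endpoint-of-" + node);
    System.out.println(connectFirstResolvable(candidates, query));
  }
}
```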

Fixes: scylladb#841
@dkropachev dkropachev force-pushed the fix/client-routes-contact-point-matching branch from f934b03 to a040c57 Compare March 19, 2026 15:36