Skip to content

check connection is active if we detect already connected#10041

Open
macfarla wants to merge 3 commits intobesu-eth:mainfrom
macfarla:peering-check-liveness-if-already-connected
Open

check connection is active if we detect already connected#10041
macfarla wants to merge 3 commits intobesu-eth:mainfrom
macfarla:peering-check-liveness-if-already-connected

Conversation

@macfarla
Copy link
Contributor

@macfarla macfarla commented Mar 13, 2026

PR description

  • Active peer in activeConnections: disconnect it and fall through to allow the new connection
  • In-progress incomplete connections (the alreadyConnectedOrConnecting else branch): still rejected as before — those are mid-handshake connections where we genuinely shouldn't have two in-flight
  • Everything else: unchanged

One note: calling existingPeer.disconnect() is asynchronous — it sends the DISCONNECT message and schedules the channel close. The new connection is allowed to proceed immediately after. If both connections briefly overlap in activeConnections, the later addPeerToEthPeers call at line 774 handles that case (it already has dedup logic there via the eviction listener at line 758-772). So there's no double-registration risk.

Fixed Issue(s)

fixes #10040

Thanks for sending a pull request! Have you done the following?

  • Checked out our contribution guidelines?
  • Considered documentation and added the doc-change-required label to this PR if updates are required.
  • Considered the changelog and included an update if required.
  • For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

  • spotless: ./gradlew spotlessApply
  • unit tests: ./gradlew build
  • acceptance tests: ./gradlew acceptanceTest
  • integration tests: ./gradlew integrationTest
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests
  • hive tests: Engine or other RPCs modified?

…ion is already connected

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Copilot AI review requested due to automatic review settings March 13, 2026 02:11
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates peer-connection gating to replace an already-registered active connection (by disconnecting it) when the same peer reconnects, addressing restart/race scenarios described in #10040.

Changes:

  • In EthPeers.gatePeerConnection, detect an existing active connection and disconnect it to allow the new connection to proceed.
  • Add/extend unit tests covering reconnection behavior and cleanup of the previous connection.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File Description
ethereum/eth/src/main/java/org/hyperledger/besu/ethereum/eth/manager/EthPeers.java Disconnects an existing active peer entry (instead of rejecting) to allow a reconnect to progress.
ethereum/eth/src/test/java/org/hyperledger/besu/ethereum/eth/manager/EthPeersTest.java Adds tests intended to validate reconnect/cleanup behavior for active peers.

You can also share your feedback on Copilot code review. Take the survey.

// New connection should be allowed, not rejected with ALREADY_CONNECTED
assertThat(result).isEmpty();
// Stale existing connection should have been disconnected
assertThat(oldConnection.isDisconnected()).isTrue();
ethPeers.gatePeerConnection(firstConnection.getPeer(), false);

// Old connection must be disconnected regardless
assertThat(firstConnection.isDisconnected()).isTrue();
Comment on lines +512 to +515
public void shouldRejectConnectionWhenGenuinelyAlreadyConnected() {
// Verify that a peer that has already been added to activeConnections via the normal path
// and whose connection is live does NOT get accepted as a second concurrent connection.
// The new behaviour replaces the old entry, so gating returns empty (allowed).
assertThat(firstConnection.isDisconnected()).isFalse();

// Second connection attempt from the same peer
ethPeers.gatePeerConnection(firstConnection.getPeer(), false);
Comment on lines +444 to +447
// Peer is actively trying to connect again while we have an existing registration.
// This typically happens after a restart race where TCP state diverges between the two
// nodes. Disconnect the stale entry and accept the fresh connection.
LOG.debug("Replacing stale connection to peer {} with new connection", peer.getLoggableId());
final Bytes id = peer.getId();
if (alreadyConnectedOrConnecting(inbound, id)) {
final EthPeer existingPeer = activeConnections.get(id);
if (existingPeer != null && !existingPeer.isDisconnected()) {
// nodes. Disconnect the stale entry and accept the fresh connection.
LOG.debug("Replacing stale connection to peer {} with new connection", peer.getLoggableId());
existingPeer.disconnect(DisconnectReason.ALREADY_CONNECTED);
} else if (alreadyConnectedOrConnecting(inbound, id)) {
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Static peers get stuck in ALREADY_CONNECTED loop after restart race condition

3 participants