add BWS stall detection and step-level debug logging #10045

Open
macfarla wants to merge 2 commits into besu-eth:main from macfarla:improve/bws-stall-logging
@macfarla

  • BackwardSyncContext: log when reusing an existing session (vs starting a new one) so stuck sessions are immediately visible
  • BackwardSyncContext: emit WARN if a session has made no progress for more than 5 minutes, including session age, time since last progress, and future.isDone() to confirm it is stuck
  • BackwardSyncContext: record last-progress timestamp on each block import so the stall timer resets as work proceeds
  • BackwardSyncContext: log attempt number on each retry so retry exhaustion is visible at DEBUG
  • BackwardSyncAlgorithm: add DEBUG log for each pickNextStep() branch (hash processing, backward fetch, forward sync, known ancestors) so the algorithm's path through each cycle is traceable
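The stall check described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `StallTracker`, `recordProgress`, `checkStall`, and the message format are hypothetical names chosen for the example. Only the 5-minute threshold and the three reported values (session age, time since last progress, `future.isDone()`) come from the description. Timestamps are passed in explicitly so the behaviour can be exercised without waiting.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the per-session state; not Besu's BackwardSyncContext.
final class StallTracker {
  // Threshold taken from the PR description: warn after more than 5 minutes without progress.
  static final Duration STALL_THRESHOLD = Duration.ofMinutes(5);

  private final Instant sessionStart;
  private final CompletableFuture<Void> future;
  private volatile Instant lastProgress;

  StallTracker(final Instant now, final CompletableFuture<Void> future) {
    this.sessionStart = now;
    this.lastProgress = now;
    this.future = future;
  }

  // Called on each block import so the stall timer resets as work proceeds.
  void recordProgress(final Instant now) {
    lastProgress = now;
  }

  // Returns the WARN message if the session looks stalled, or empty if it is still progressing.
  Optional<String> checkStall(final Instant now) {
    final Duration sinceProgress = Duration.between(lastProgress, now);
    if (sinceProgress.compareTo(STALL_THRESHOLD) <= 0) {
      return Optional.empty();
    }
    final Duration sessionAge = Duration.between(sessionStart, now);
    return Optional.of(
        String.format(
            "Backward sync session appears stalled: sessionAge=%ds, sinceLastProgress=%ds, futureDone=%b",
            sessionAge.getSeconds(), sinceProgress.getSeconds(), future.isDone()));
  }
}
```

In this sketch the caller decides how to emit the returned message (for example at WARN when reusing an existing session). Reporting `futureDone=false` alongside a long idle time is what distinguishes a stuck session from one that has already completed.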

Thanks for sending a pull request! Have you done the following?

  • Checked out our contribution guidelines?
  • Considered documentation and added the doc-change-required label to this PR if updates are required.
  • Considered the changelog and included an update if required.
  • For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

  • spotless: ./gradlew spotlessApply
  • unit tests: ./gradlew build
  • acceptance tests: ./gradlew acceptanceTest
  • integration tests: ./gradlew integrationTest
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests
  • hive tests: Engine or other RPCs modified?

- BackwardSyncContext: log when reusing an existing session (vs
  starting a new one) so stuck sessions are immediately visible
- BackwardSyncContext: emit WARN if a session has made no progress
  for more than 5 minutes, including session age, time since last
  progress, and future.isDone() to confirm it is stuck
- BackwardSyncContext: record last-progress timestamp on each block
  import so the stall timer resets as work proceeds
- BackwardSyncContext: log attempt number on each retry so retry
  exhaustion is visible at DEBUG
- BackwardSyncAlgorithm: add DEBUG log for each pickNextStep()
  branch (hash processing, backward fetch, forward sync, known
  ancestors) so the algorithm's path through each cycle is traceable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
@macfarla added the dev experience and syncing labels Mar 13, 2026
Copilot AI review requested due to automatic review settings March 13, 2026 23:27
Copilot AI left a comment

Pull request overview

Adds stall detection and debug logging to the backward sync subsystem to improve observability of stuck sessions and algorithm step decisions.

Changes:

  • Add stall detection warning when a backward sync session makes no progress for 5+ minutes
  • Add debug logging for session reuse, retry attempts, and each pickNextStep() branch
  • Track last-progress timestamp on block imports to reset the stall timer
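To illustrate why logging the attempt number makes retry exhaustion visible, here is a small sketch of a retry loop that emits one debug line per attempt. It is an assumption-laden example, not the PR's code: `RetryLogSketch`, `runWithRetries`, and the message wording are hypothetical, and `java.util.logging` at `Level.FINE` stands in for a DEBUG-level logger so the example has no external dependencies.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration only; not part of Besu.
final class RetryLogSketch {
  private static final Logger LOG = Logger.getLogger(RetryLogSketch.class.getName());

  // Runs the task up to maxAttempts times, logging the attempt number before each try.
  // Each log line is also appended to `trace` so the sequence can be inspected
  // without configuring a log handler.
  static <T> T runWithRetries(
      final Supplier<T> task, final int maxAttempts, final List<String> trace) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      final String line = "Backward sync attempt " + attempt + " of " + maxAttempts;
      trace.add(line);
      LOG.log(Level.FINE, line);
      try {
        return task.get();
      } catch (final RuntimeException e) {
        last = e; // remember the failure and try again
      }
    }
    throw new IllegalStateException("Retries exhausted after " + maxAttempts + " attempts", last);
  }
}
```

With a line per attempt, a log that ends at "attempt 5 of 5" followed by a failure shows directly that the retry budget ran out, rather than leaving the reader to infer it from timing.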

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| BackwardSyncContext.java | Stall detection, progress tracking, session reuse logging, retry attempt logging |
| BackwardSyncAlgorithm.java | Debug logging for each pickNextStep() decision branch |

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Labels

dev experience, syncing
