op-supernode: Track FCU retry-verify workaround in RewindToTimestamp #19772

@wwared

Summary

RewindToTimestamp runs a retry-verify loop after each forkchoice update (FCU) to work around an op-reth race condition: two back-to-back FCUs for sibling blocks at the same height can cause a safe head mismatch panic (reth#23205).

This issue tracks whether to keep this workaround long term or remove it once the race is fixed upstream.

Current implementation (PR #19773)

After each FCU, verifyRewindState checks that all three engine heads (unsafe, safe, finalized) match the expected values. If they haven't converged, the FCU is retried after a 500ms delay, for up to 20 attempts (roughly 10s in total).

In practice on op-reth, the verify query itself usually gives reth enough time to flush state: most rewinds need 0 retries, with an occasional 1-retry case observed.

Failure mode

If the retry loop exhausts all attempts, RewindToTimestamp returns ErrRewindFCUHeadMismatch. Without this workaround, the rewind step fails and the node FCUs to a synthetic block instead, which can lead to errors of the form "panic: superAuthority supplied an identifier for the safe head which is not known to the engine".

Additional follow-up

- Medium: op-supernode/supernode/chain_container/engine_controller/rewind.go:205 retries on any verifyRewindState failure, not just the intended “heads haven’t converged yet” case. That means a plain L2BlockRefByLabel read failure gets retried 20 times and is finally reclassified as ErrRewindFCUHeadMismatch, even though the FCU itself may have succeeded. In op-supernode/supernode/chain_container/chain_container.go:517, that wrapped error is treated as a temporary rewind error, so the caller will retry the whole rewind instead of surfacing the real EL/RPC failure. I’d split “label read failed” from “label hash mismatched” and only retry the latter.

Originally posted by @karlfloersch in #19773 (review)

Labels: A-op-supernode (Area: op-supernode), H-interop (Hardfork: change planned for interop upgrade)
