Skip to content

[runtime] Lock null-store symmetry invariant in DurableExecutionManager#666

Open
weiqingy wants to merge 1 commit into
apache:mainfrom
weiqingy:645-impl
Open

[runtime] Lock null-store symmetry invariant in DurableExecutionManager#666
weiqingy wants to merge 1 commit into
apache:mainfrom
weiqingy:645-impl

Conversation

@weiqingy
Copy link
Copy Markdown
Collaborator

@weiqingy weiqingy commented May 13, 2026

Issue #645 originally described a memory leak in checkpointIdToSeqNums when actionStateStore == null: an unconditional put on the snapshot side paired with a guarded remove on the cleanup side, causing the map to grow unboundedly when durable execution was disabled.

That asymmetry was structurally fixed during the DurableExecutionManager extraction in #546, which added a symmetric actionStateStore == null early-return at the top of snapshotLastCompletedSequenceNumbers. The symmetry is now held by two independent conditionals with no test or javadoc tying them together — a future refactor that drops either guard would silently reintroduce the leak.

This commit locks the invariant in place:

  • Add a @VisibleForTesting accessor for checkpointIdToSeqNums mirroring the existing getActionStateStore() precedent.
  • Add a regression test that, with actionStateStore == null, runs two snapshot + notifyCheckpointComplete cycles and asserts the map stays empty. verifyNoInteractions on the mock KeyedStateBackend additionally proves the snapshot-side early-return fires before any backend access.
  • Strengthen javadoc on both methods to call out the symmetric guard invariant and cross-link to issue [Bug][runtime] Fix memory leak in DurableExecutionManager.checkpointIdToSeqNums #645.

No production behavior change.

Documentation

  • doc-needed
  • doc-not-needed
  • doc-included

@github-actions github-actions Bot added doc-not-needed Your PR changes do not impact docs fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue. labels May 13, 2026
Issue apache#645 originally described a memory leak in checkpointIdToSeqNums
when actionStateStore == null: an unconditional put on the snapshot side
paired with a guarded remove on the cleanup side, causing the map to grow
unboundedly when durable execution was disabled.

That asymmetry was structurally fixed during the DurableExecutionManager
extraction in apache#546, which added a symmetric actionStateStore == null
early-return at the top of snapshotLastCompletedSequenceNumbers. The
symmetry is now held by two independent conditionals with no test or
javadoc tying them together — a future refactor that drops either guard
would silently reintroduce the leak.

This commit locks the invariant in place:

- Add a @VisibleForTesting accessor for checkpointIdToSeqNums mirroring
  the existing getActionStateStore() precedent.
- Add a regression test that, with actionStateStore == null, runs two
  snapshot + notifyCheckpointComplete cycles and asserts the map stays
  empty. verifyNoInteractions on the mock KeyedStateBackend additionally
  proves the snapshot-side early-return fires before any backend access.
- Strengthen javadoc on both methods to call out the symmetric guard
  invariant and cross-link to issue apache#645.

No production behavior change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant