[feat][broker] PIP-468: scalable-topic seek + clear-backlog admin API#25696

Open
merlimat wants to merge 1 commit into apache:master from merlimat:st-seek-clear-backlog

@merlimat merlimat commented May 6, 2026

Summary

Two new operational primitives on a scalable-topic subscription, exposed as admin REST endpoints, admin client methods, and pulsar-admin CLI subcommands.

seek by wall-clock timestamp

Reset every per-segment cursor to a point in time. The controller uses each segment's recorded [createdAtMs, sealedAtMs) window to dispatch the cheapest per-segment op:

| Segment relative to t | Per-segment op |
| --- | --- |
| Sealed entirely before t | skip-all (cursor → end) |
| Created entirely after t | seek to timestamp=0 (earliest) |
| Alive at t | seek to timestamp=t |
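
The classification above can be sketched as a pure function. This is a minimal illustration, not the controller code from the PR: the class, enum, and method names are hypothetical, and it assumes an active segment is represented with sealedAtMs = Long.MAX_VALUE and that the window is half-open, so a segment sealed exactly at t counts as "sealed before t".

```java
/** Hypothetical sketch of the per-segment seek classification; names are not from the PR. */
final class SegmentSeekPlanner {

    /** The three per-segment operations listed in the table. */
    public enum SegmentOp { SKIP_ALL, SEEK_EARLIEST, SEEK_TO_TIMESTAMP }

    /**
     * Classifies a segment whose lifetime is the half-open window [createdAtMs, sealedAtMs).
     * Assumption: an active (unsealed) segment is passed with sealedAtMs = Long.MAX_VALUE.
     */
    public static SegmentOp classify(long createdAtMs, long sealedAtMs, long t) {
        if (sealedAtMs <= t) {
            // Every message in the segment predates t: move the cursor to the end.
            return SegmentOp.SKIP_ALL;
        }
        if (createdAtMs > t) {
            // Every message in the segment postdates t: rewind to the earliest position.
            return SegmentOp.SEEK_EARLIEST;
        }
        // The segment was alive at t: a real timestamp seek is needed.
        return SegmentOp.SEEK_TO_TIMESTAMP;
    }
}
```

Only the third case needs a message lookup on the segment owner; the other two are decided from metadata alone, which is what makes them the cheaper ops.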

clear-backlog

Dispatch skip-all on the subscription across every segment in the DAG.
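
As a rough sketch of that fan-out, assuming the per-segment skip-all call is available as an async function: the class name and the skipAll parameter below are hypothetical stand-ins, not the admin client API from the PR.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Hypothetical sketch of the clear-backlog fan-out; not the controller code from the PR. */
final class ClearBacklogFanOut {

    /**
     * Issues one skip-all per segment and returns a future that completes
     * once every per-segment call has completed.
     *
     * @param segments segment names in the DAG (illustrative representation)
     * @param skipAll  stand-in for the per-segment skip-all admin call
     */
    public static CompletableFuture<Void> clearBacklog(
            List<String> segments,
            Function<String, CompletableFuture<Void>> skipAll) {
        CompletableFuture<?>[] calls = segments.stream()
                .map(skipAll)
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(calls);
    }
}
```

With N segments this produces N skip-all calls, which is the property testClearBacklogDispatchesSkipAllToEverySegment asserts.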

Plumbing

  • Per-segment endpoints under /segments/.../subscription/{sub}/seek and .../skip-all (super-user only, routed to the segment owner). They call Subscription.resetCursor / clearBacklog under the hood.
  • Parent-topic endpoints under /scalable/.../subscriptions/{sub}/seek and .../skip-all, gated on RESET_CURSOR / SKIP authz, routed to the controller leader.
  • Admin client interface + impl pairs (sync/async): segment-level (seekSegmentSubscription, clearSegmentSubscriptionBacklog) and parent-level (seekSubscription, clearBacklog).
  • CLI:
    • pulsar-admin scalable-topics seek <topic> --subscription <s> --time 1h. Here --time is a relative offset; the absolute timestamp passed to the broker is now - offset. Standard time-unit converter (1s, 5m, 1h, 5d, …).
    • pulsar-admin scalable-topics clear-backlog <topic> --subscription <s>.
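
To make the --time arithmetic concrete, here is a small standalone sketch of turning a relative offset into the absolute timestamp. It is not the CLI's converter: the class and method names are hypothetical, and only the s, m, h, and d suffixes are handled as an illustrative subset.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the "now - offset" computation; not the pulsar-admin converter. */
final class RelativeSeekTime {

    /** Parses offsets such as "1s", "5m", "1h", "5d" into milliseconds (subset of units). */
    public static long parseOffsetMillis(String offset) {
        if (offset == null || offset.length() < 2) {
            throw new IllegalArgumentException("Expected <number><unit>, got: " + offset);
        }
        char unit = offset.charAt(offset.length() - 1);
        long value = Long.parseLong(offset.substring(0, offset.length() - 1));
        switch (unit) {
            case 's': return TimeUnit.SECONDS.toMillis(value);
            case 'm': return TimeUnit.MINUTES.toMillis(value);
            case 'h': return TimeUnit.HOURS.toMillis(value);
            case 'd': return TimeUnit.DAYS.toMillis(value);
            default: throw new IllegalArgumentException("Unsupported time unit: " + unit);
        }
    }

    /** Absolute timestamp sent to the broker: now - offset. */
    public static long absoluteTimestampMs(long nowMs, String offset) {
        return nowMs - parseOffsetMillis(offset);
    }
}
```

For example, absoluteTimestampMs(System.currentTimeMillis(), "1h") yields the wall-clock time one hour ago in epoch milliseconds.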

Removals

Subscription seek is now an admin operation, not a consumer operation. The following V5 client surface goes away:

  • StreamConsumerBuilder.seek(MessageId) and seek(Instant) — were placeholder no-ops. Initial position is set via subscriptionInitialPosition(EARLIEST/LATEST); timestamp seek is the new admin call.
  • CheckpointConsumer.seek(Checkpoint) and the async counterpart. Connector frameworks restore from a saved checkpoint via CheckpointConsumerBuilder.startPosition(Checkpoint).
  • Checkpoint.atTimestamp(Instant) factory and the underlying TimestampCheckpoint type — timestamp positioning is the admin surface, not a checkpoint kind.
  • Checkpoint.creationTime() — was just metadata, not part of the position vector. Wire format simplifies accordingly. Connector frameworks that need timing can record it themselves.

Test plan

  • ScalableTopicControllerTest:
    • testSeekSubscriptionDispatchesPerSegmentByTimestamp — three segments at hand-picked timestamps (one fully before t, one straddling, one fully after); asserts the right per-segment admin call is issued for each.
    • testClearBacklogDispatchesSkipAllToEverySegment — N skip-all calls for N segments.
  • V5 checkpoint suites updated and green: V5CheckpointConsumerBasicTest, V5CheckpointConsumerDagReplayTest, V5CheckpointConsumerGroupTest, V5AsyncApisTest, CheckpointV5Test.
  • Checkstyle clean (pulsar-broker, pulsar-client-admin-api, pulsar-client-admin, pulsar-client-api-v5, pulsar-client-v5, pulsar-client-tools).

Two new operational primitives on a scalable-topic subscription, exposed
as admin REST endpoints, admin client methods, and pulsar-admin CLI
subcommands:

- seek by wall-clock timestamp: reset every per-segment cursor to a
  point in time. The controller uses each segment's recorded
  [createdAtMs, sealedAtMs) window to dispatch the cheapest per-segment
  op:
    - sealed entirely before t  -> skip-all (cursor to end)
    - created entirely after t  -> seek to timestamp 0 (earliest)
    - alive at t                -> seek to timestamp t
- clear backlog: dispatch skip-all on the subscription across every
  segment in the DAG.

Plumbing:
  - Per-segment endpoints in /segments/.../subscription/{sub}/seek and
    .../skip-all (super-user, route to segment owner). They call the
    standard Subscription.resetCursor / clearBacklog under the hood.
  - Parent-topic endpoints in /scalable/.../subscriptions/{sub}/seek
    and .../skip-all, gated on RESET_CURSOR / SKIP authz, routed to
    the controller leader.
  - Admin client interface + impl pairs (seekSegmentSubscription /
    clearSegmentSubscriptionBacklog and seekSubscription / clearBacklog).
  - CLI: `pulsar-admin scalable-topics seek <topic> --subscription <s>
    --time 1h` (relative offset, computed as now - offset) and
    `pulsar-admin scalable-topics clear-backlog <topic> -s <s>`.

Removals (subscription seek is now an admin operation, not a consumer
operation):
  - StreamConsumerBuilder.seek(MessageId / Instant) — were placeholder
    no-ops; gone. Initial position uses subscriptionInitialPosition;
    timestamp seek uses the new admin API.
  - CheckpointConsumer.seek(Checkpoint) and the async counterpart.
    Frameworks restore from a saved checkpoint via
    CheckpointConsumerBuilder.startPosition(Checkpoint).
  - Checkpoint.atTimestamp(Instant) factory and the underlying
    TimestampCheckpoint type — timestamp positioning is the admin
    surface, not a checkpoint kind.
  - Checkpoint.creationTime() — was just metadata, not part of the
    position vector. Connector frameworks that need timing record it
    themselves. Wire format simplifies accordingly.

Tests:
  - ScalableTopicControllerTest:
      testSeekSubscriptionDispatchesPerSegmentByTimestamp — three
      segments at hand-picked timestamps (one fully before t, one
      straddling, one fully after); asserts the right per-segment
      admin call is issued for each.
      testClearBacklogDispatchesSkipAllToEverySegment — N skip-all
      calls for N segments.
  - V5CheckpointConsumerBasicTest.testSeekRewindsToEarlierCheckpoint —
    removed; corresponding example in Examples.java removed.
  - V5AsyncApisTest.testAsyncCheckpointConsumerCheckpointAndSeek —
    slimmed to testAsyncCheckpointConsumerCheckpoint.
  - CheckpointV5Test — drop timestamp-roundtrip + creationTime
    assertions; constructor calls updated to single-arg.

@lhotari lhotari changed the title from "PIP-468: scalable-topic seek + clear-backlog admin API" to "[feat][broker] PIP-468: scalable-topic seek + clear-backlog admin API" May 6, 2026
@lhotari
Member

lhotari commented May 6, 2026

[BUG] 404 conflates "segment topic not loaded" with "subscription not found", causing silent failures.

pulsar-broker/.../ScalableTopicController.java ~L671-675 (and the matching block in clearSubscriptionBacklogOnSegment ~L693-697) swallows PulsarAdminException.NotFoundException as success with the comment "No cursor on this segment yet — nothing to seek. Tolerated."

But the segment endpoint in Segments.java returns 404 for two distinct cases:

  1. Segment topic not loaded: … — the endpoint comment even says "callers can retry once the segment owner has loaded it".
  2. Subscription not found on segment: … — the case the controller actually intends to tolerate.

The admin client only sees the status code, not the message body, so a transient unload (segment owner restarting, ownership churn) during a parent-topic seek is silently swallowed and the caller thinks the seek succeeded across all segments — when in fact one or more were skipped entirely. For a write op like seek / clear-backlog this is a real correctness bug, not just a UX wart.

A few ways to fix:

  • Distinguish via distinct status codes on the segment endpoint (e.g. 503 Service Unavailable for not-loaded, 404 only for subscription-not-found).
  • Use getTopic() instead of getTopicIfExists() on the segment side so the owner loads on demand (matches how cursor-mutating ops on regular topics behave).
  • Carry a structured error in the response body and parse it admin-client-side.

The Javadoc on seekSubscription at L620 explicitly states "Subscription-not-found on a segment … is tolerated as success" — the implementation is broader than that contract.
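
A minimal sketch of how the controller side could look under the first option, assuming the segment endpoint were changed to return 503 for "segment topic not loaded" and to keep 404 only for "subscription not found". That split is a proposal, not the current behavior, and the class, enum, and method names are hypothetical.

```java
/** Hypothetical sketch of status-code-based handling; assumes the proposed 503/404 split. */
final class SegmentErrorPolicy {

    /** What the controller does with a failed per-segment call. */
    public enum Outcome { TOLERATE, RETRY, FAIL }

    /**
     * Maps the HTTP status of a failed per-segment seek / skip-all call to an outcome.
     * Assumption: 404 means only "subscription not found on this segment",
     * and 503 means "segment topic not loaded yet".
     */
    public static Outcome classify(int httpStatus) {
        switch (httpStatus) {
            case 404:
                // No cursor on this segment: nothing to seek, treat as success.
                return Outcome.TOLERATE;
            case 503:
                // Transient: the segment owner has not loaded the topic, so retry.
                return Outcome.RETRY;
            default:
                // Anything else is surfaced to the caller.
                return Outcome.FAIL;
        }
    }
}
```

The point of the split is that only the case the Javadoc names is swallowed; a transient unload becomes a retry instead of a silent success.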

@lhotari
Member

lhotari commented May 6, 2026

[BUG] Empty active segments straddling t will fail the entire seek.

pulsar-broker/.../ScalableTopicController.java#seekSubscriptionOnSegment (~L646): for a freshly-created active segment that overlaps timestampMs but has no messages yet (just split, no producer activity), the classification falls through to seekSegmentSubscriptionAsync(name, sub, t). On the segment-owner broker, PersistentSubscription.resetCursor(t) calls PersistentMessageFinder.findMessages(t, …) which returns null (no entries match), and then cursor.getFirstPosition() is also null for an empty managed ledger — so the future completes exceptionally with SubscriptionInvalidCursorPosition (PersistentSubscription.java ~L887).

Because seekSubscription combines the per-segment futures with CompletableFuture.allOf, which reports only a single failure, and there's no compensation, one empty straddling segment fails the whole parent-topic seek even though the cursors on the other segments have already been repositioned. There's no rollback path.

Realistic worst case: split-segment just ran, the new child segments are active and empty, operator runs pulsar-admin scalable-topics seek to recover from a bad deploy → the seek fails with no clear hint that the issue is empty children.

Options:

  • Short-circuit in seekSubscriptionOnSegment when the segment is known to be empty (needs an entry-count signal, e.g. on the segment metadata or a cheap stat call).
  • Tolerate SubscriptionInvalidCursorPosition from segments the controller can verify are empty — distinguishing "empty" from "corrupted cursor" requires care.
  • At minimum, aggregate failures and surface which segment(s) failed instead of propagating the first one.

This is the kind of issue an end-to-end "produce → seek → consume" integration test across multiple segments would have caught — the new controller test only exercises mocks, so the managed-ledger reset-cursor path isn't covered at all.
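
For the "at minimum" option, a sketch of aggregating per-segment failures could look like the following. It is an illustration only: the class, the exception type, and the seekOne parameter are hypothetical stand-ins for the controller's per-segment call.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of failure aggregation for a parent-topic seek. */
final class AggregatingSeek {

    /** Carries every failed segment and its cause, not just the first one. */
    public static final class SegmentSeekException extends RuntimeException {
        private final Map<String, Throwable> failures;

        public SegmentSeekException(Map<String, Throwable> failures) {
            super("Seek failed on segments: " + failures.keySet());
            this.failures = Map.copyOf(failures);
        }

        public Map<String, Throwable> failures() {
            return failures;
        }
    }

    /**
     * Runs the per-segment seek on every segment, records each failure by segment
     * name, and fails the returned future with the full set once all calls finish.
     */
    public static CompletableFuture<Void> seekAll(
            List<String> segments,
            Function<String, CompletableFuture<Void>> seekOne) {
        Map<String, Throwable> failures = new ConcurrentHashMap<>();
        CompletableFuture<?>[] calls = segments.stream()
                .map(seg -> seekOne.apply(seg).handle((ok, err) -> {
                    if (err != null) {
                        failures.put(seg, err); // remember which segment failed and why
                    }
                    return null; // swallow here; reported in aggregate below
                }))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(calls).thenRun(() -> {
            if (!failures.isEmpty()) {
                throw new SegmentSeekException(failures);
            }
        });
    }
}
```

This does not add rollback, but the operator at least learns which segments were left at their old position.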

@lhotari
Member

lhotari commented May 6, 2026

The 2 comments above are from a local Claude Code review.

Member

@lhotari lhotari left a comment

LGTM, just check the local Claude Code review comments.
