Problem
The CDC library has no mechanism to dynamically adjust its query rate based on cluster health. When the cluster is under heavy load (e.g., during consumer startup catch-up), workers continue issuing queries at full speed, worsening the overload.
Existing mechanisms (insufficient)
- `ExponentialRetryBackoffWithJitter`: only triggers on errors, not proactively on rising latency
- `WorkerConfiguration.withMinimalWaitForWindowMs()`: static delay, doesn't adapt to cluster load
- Connection pool limits: cause `BusyPoolException` rejections rather than smooth throttling
Proposal
Implement adaptive backpressure that monitors cluster response latency and proactively slows down CDC queries before errors occur.
Suggested approach
- Track rolling average/p99 latency of CDC log queries within `TaskAction` or a dedicated latency tracker
- When latency exceeds a configurable threshold, introduce increasing delays between consecutive queries (using the existing non-blocking `delay()` mechanism in `TaskAction`)
- When latency returns to normal, gradually ramp back up to full speed
- New configuration options in `WorkerConfiguration.Builder`:
  - `withBackpressureEnabled(boolean)` (default: `true`)
  - `withBackpressureLatencyThresholdMs(long)`: latency above which throttling kicks in (e.g., `500`)
  - `withBackpressureMaxDelayMs(long)`: maximum added delay when under pressure (e.g., `5000`)
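To make the suggested approach concrete, here is a minimal, self-contained sketch of the throttling logic. Everything in it is hypothetical: the class `AdaptiveBackpressure`, the method `onQueryCompleted`, the `smoothing` parameter, and the 50 ms initial step are illustration only and do not exist in the library. For brevity it tracks an exponentially weighted moving average instead of a rolling window or p99, doubles the delay while the average is above the threshold, and halves it once the average drops back.

```java
/**
 * Hypothetical sketch, not part of scylla-cdc-java.
 * Turns observed query latencies into a delay to apply before the next query.
 */
final class AdaptiveBackpressure {
    // Assumed first delay step once throttling starts (illustrative value).
    private static final long INITIAL_STEP_MS = 50;

    private final long latencyThresholdMs; // cf. the proposed withBackpressureLatencyThresholdMs
    private final long maxDelayMs;         // cf. the proposed withBackpressureMaxDelayMs
    private final double smoothing;        // weight of the newest sample, in (0, 1]

    private double averageLatencyMs = -1;  // negative means "no sample yet"
    private long currentDelayMs = 0;

    AdaptiveBackpressure(long latencyThresholdMs, long maxDelayMs, double smoothing) {
        if (latencyThresholdMs < 0 || maxDelayMs < 0 || smoothing <= 0 || smoothing > 1) {
            throw new IllegalArgumentException("invalid backpressure parameters");
        }
        this.latencyThresholdMs = latencyThresholdMs;
        this.maxDelayMs = maxDelayMs;
        this.smoothing = smoothing;
    }

    /** Records one query latency and returns the delay (ms) to wait before the next query. */
    synchronized long onQueryCompleted(long latencyMs) {
        // Exponentially weighted moving average; the first sample seeds it.
        averageLatencyMs = averageLatencyMs < 0
                ? latencyMs
                : smoothing * latencyMs + (1 - smoothing) * averageLatencyMs;

        if (averageLatencyMs > latencyThresholdMs) {
            // Under pressure: start at the initial step, then double, capped at maxDelayMs.
            if (currentDelayMs == 0) {
                currentDelayMs = Math.min(maxDelayMs, INITIAL_STEP_MS);
            } else if (currentDelayMs > maxDelayMs / 2) {
                currentDelayMs = maxDelayMs; // avoids overflow when doubling
            } else {
                currentDelayMs *= 2;
            }
        } else {
            // Healthy again: halve the delay so the query rate ramps back up gradually.
            currentDelayMs /= 2;
        }
        return currentDelayMs;
    }
}
```

In the library, the returned value would presumably be handed to the existing non-blocking delay mechanism in `TaskAction` rather than to a blocking sleep. Doubling on pressure and halving on recovery is only one possible policy; a linear ramp-up would recover more cautiously.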
Key files
- `scylla-cdc-base/src/main/java/com/scylladb/cdc/model/worker/WorkerConfiguration.java`: add config options
- `scylla-cdc-base/src/main/java/com/scylladb/cdc/model/worker/TaskAction.java`: instrument latency tracking and apply adaptive delays
- `scylla-cdc-lib/src/main/java/com/scylladb/cdc/lib/CDCConsumer.java`: expose in the public builder API
Related
- scylla-cdc-source-connector#236 ("Add adaptive backpressure based on cluster response latency"): connector-side issue for exposing this as connector config properties