Open
Description
Problem
When a CDC consumer starts up (especially for the first time), all workers begin querying the cluster simultaneously with no global concurrency control. The current execution model uses CompletableFuture chaining on a ScheduledExecutorService, meaning a single thread can have many outstanding async requests. The executor pool size (default: availableProcessors - 1) limits thread count but not in-flight queries.
On a cluster with 256 vnodes, this can result in 250+ concurrent CDC log readers overwhelming the cluster with queries.
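The gap between thread count and in-flight count can be reproduced in isolation. The following is a minimal sketch, not code from the library: `UnboundedInFlightDemo` and `pendingAfterStart` are hypothetical names, and each "query" is just a `CompletableFuture` that stays pending until the demo completes it by hand, standing in for a cluster response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it does not use any scylla-cdc-java types.
final class UnboundedInFlightDemo {

    /** Starts `workers` fake async queries on `threads` threads and returns how many are pending at once. */
    static int pendingAfterStart(int threads, int workers) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(threads);
        AtomicInteger inFlight = new AtomicInteger();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        try {
            List<CompletableFuture<Void>> starts = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                starts.add(CompletableFuture.runAsync(() -> {
                    // Issuing an async query returns immediately, so the thread is free again.
                    CompletableFuture<Void> query = new CompletableFuture<>();
                    query.whenComplete((result, error) -> inFlight.decrementAndGet());
                    inFlight.incrementAndGet();
                    synchronized (pending) {
                        pending.add(query);
                    }
                }, pool));
            }
            // Wait until every worker has issued its query; none has been answered yet.
            CompletableFuture.allOf(starts.toArray(new CompletableFuture[0])).join();
            int observed = inFlight.get();
            // Simulate the cluster finally responding to all queries.
            synchronized (pending) {
                pending.forEach(q -> q.complete(null));
            }
            return observed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // One thread, 256 workers: all 256 queries are outstanding at the same time.
        System.out.println(pendingAfterStart(1, 256)); // prints 256
    }
}
```

Even with a single pool thread, every worker's query is outstanding at once, which is why the pool size alone cannot act as a concurrency limit.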
Existing mechanisms (insufficient)
- WorkerConfiguration.withMinimalWaitForWindowMs(): adds a static delay per worker but doesn't limit aggregate concurrency
- Connection pool limits (poolingMaxRequestsPerConnectionLocal, poolingMaxQueueSize): act as backstops but allow far too many concurrent requests in aggregate
- ExponentialRetryBackoffWithJitter: only triggers on errors, by which point the cluster is already stressed
Proposal
Add a shared concurrency limiter (e.g., a Semaphore) across all workers to cap the number of concurrent in-flight CDC queries.
Suggested approach
- Add a configurable max concurrent queries option to WorkerConfiguration.Builder (e.g., withMaxConcurrentQueries(int max))
- Implement via a shared Semaphore that workers acquire before issuing a CDC query in TaskAction and release on completion
- A sensible default (e.g., 32–64) would prevent cluster overload while still allowing reasonable throughput
- The semaphore should be shared across all workers within a single CDCConsumer instance
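A rough sketch of the gating described above, assuming the query is available as a `Supplier<CompletableFuture<T>>`. `QueryLimiter`, `run`, and `QueryLimiterDemo` are hypothetical names, not existing library API, and the fake query with its 2 ms delay only simulates a cluster round trip. How this would hook into TaskAction is left open here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical limiter; one instance would be shared by all workers of a consumer. */
final class QueryLimiter {
    private final Semaphore permits;

    QueryLimiter(int maxConcurrentQueries) {
        this.permits = new Semaphore(maxConcurrentQueries);
    }

    /** Blocks until a permit is free, starts the query, and releases the permit when it completes. */
    <T> CompletableFuture<T> run(Supplier<CompletableFuture<T>> query) {
        permits.acquireUninterruptibly();
        try {
            return query.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException | Error e) {
            permits.release(); // the query never started, so hand the permit back
            throw e;
        }
    }
}

/** Hypothetical harness that measures the peak number of simultaneous fake queries. */
final class QueryLimiterDemo {
    static int peakInFlight(int maxConcurrentQueries, int queries) {
        QueryLimiter limiter = new QueryLimiter(maxConcurrentQueries);
        ExecutorService workers = Executors.newFixedThreadPool(8);
        ScheduledExecutorService cluster = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        // Fake CDC query: counts itself as in flight, then "responds" 2 ms later.
        Supplier<CompletableFuture<Void>> fakeQuery = () -> {
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            CompletableFuture<Void> done = new CompletableFuture<>();
            cluster.schedule(() -> {
                inFlight.decrementAndGet();
                done.complete(null);
            }, 2, TimeUnit.MILLISECONDS);
            return done;
        };
        try {
            List<CompletableFuture<Void>> all = new ArrayList<>();
            for (int i = 0; i < queries; i++) {
                all.add(CompletableFuture
                        .supplyAsync(() -> limiter.run(fakeQuery), workers)
                        .thenCompose(f -> f));
            }
            CompletableFuture.allOf(all.toArray(new CompletableFuture[0])).join();
            return peak.get();
        } finally {
            workers.shutdown();
            cluster.shutdown();
        }
    }

    public static void main(String[] args) {
        // With a limit of 4, the reported peak never exceeds 4.
        System.out.println(peakInFlight(4, 64));
    }
}
```

One design caveat with this sketch: `acquireUninterruptibly()` blocks the calling thread, which may stall other tasks scheduled on the same executor. A non-blocking variant that queues pending queries and starts them from the completion callback would avoid that, at the cost of more code.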
Key files
- scylla-cdc-base/src/main/java/com/scylladb/cdc/model/worker/WorkerConfiguration.java: add config option
- scylla-cdc-base/src/main/java/com/scylladb/cdc/model/worker/TaskAction.java: add semaphore gating around query execution
- scylla-cdc-lib/src/main/java/com/scylladb/cdc/lib/CDCConsumer.java: expose in the public builder API
Related
- Add global concurrency limiter for CDC queries across all workers scylla-cdc-source-connector#235 — connector-side issue for exposing this as a connector config property