-
Notifications
You must be signed in to change notification settings - Fork 707
cloud_storage_clients: rewrite pick_two_random_shards for O(1) sampling #29099
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
cloud_storage_clients: rewrite pick_two_random_shards for O(1) sampling #29099
Conversation
Replace the shuffle-based algorithm with rejection-free direct sampling. Improvements: - O(1) instead of O(n) shuffle per call - Reuses thread-local RNG instead of creating new std::random_device and std::mt19937 on every invocation - Exactly one random draw per selection, no retries - No heap allocation (old version used std::vector) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
CI test resultstest results on build#78308
|
| [[nodiscard]] std::pair<ss::shard_id, ss::shard_id> pick_two_random_shards() { | ||
| using dist_t = std::uniform_int_distribution<ss::shard_id>; | ||
|
|
||
| static thread_local std::mt19937 gen{std::random_device{}()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gemini
Random Number Generation: std::random_device is often implemented as a read from /dev/urandom (syscall), which is too slow for a "Zero-Cost" abstraction, and std::mt19937 is large (2.5KB state).
Fix: Use a lightweight PRNG (like PCG or Xorshift) or simply a thread_local std::default_random_engine seeded once (not on every jitter call).
| [[nodiscard]] std::pair<ss::shard_id, ss::shard_id> pick_two_random_shards() { | ||
| using dist_t = std::uniform_int_distribution<ss::shard_id>; | ||
|
|
||
| static thread_local std::mt19937 gen{std::random_device{}()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use our random generator library? Travis updated it a few months ago to have different seeding policies to improve reproducibility, along with other improvements.
Replace the shuffle-based algorithm with rejection-free direct sampling.
Improvements:
🤖 Generated with Claude Code
Backports Required
Release Notes