Draft
39 commits
5d84468
handle smigrating
nkaradzhov Nov 11, 2025
eab86c1
first approximation to handling smigrated
nkaradzhov Nov 13, 2025
fc06af6
deduplicate notifications based on sequence id
nkaradzhov Dec 2, 2025
44e4f02
add slotnumber to commands
nkaradzhov Jan 29, 2026
aa56883
add support for extracting commands from queue
nkaradzhov Dec 1, 2025
6e39e00
parse notification
nkaradzhov Dec 1, 2025
bb2a8cb
work on main algo
nkaradzhov Dec 2, 2025
71dbdc4
fix: handle string values in push message reply comparison
nkaradzhov Dec 2, 2025
3d55ca4
parse SMIGRATED according to new format
nkaradzhov Dec 2, 2025
a5d1b2c
comply with the new notification structure
nkaradzhov Dec 2, 2025
aa032d4
refine algo
nkaradzhov Dec 2, 2025
5e13dc4
handle pubSubNode replacement
nkaradzhov Dec 3, 2025
22b2050
tests: merge all `after` functions into one
nkaradzhov Dec 5, 2025
14fefb7
tests: add `testWithProxiedCluster()` function
nkaradzhov Jan 29, 2026
6d73f0e
Update index.ts
nkaradzhov Jan 29, 2026
3ee16fd
tests: add ProxyController for easier proxy comms
nkaradzhov Dec 5, 2025
bcbf3fb
fix: access private queue through _self proxy and guard client close …
PavelPashov Dec 10, 2025
013da8e
test(cluster): add fault injector infrastructure for hitless upgrade …
PavelPashov Dec 10, 2025
74c5144
feat(test-utils): add RE database management and test utilities
nkaradzhov Jan 29, 2026
11e8ae6
fix: fix command queue extraction and prepend logic
PavelPashov Dec 17, 2025
a7e76cf
test: add slot migration tests and refactor proxied fault injector
PavelPashov Dec 18, 2025
1a92c43
fix: wait for ALL ports while spawning proxied redis
nkaradzhov Jan 7, 2026
9ef9fce
fix: handle partial PubSubListeners in resubscribeAllPubSubListeners
nkaradzhov Jan 7, 2026
e19f5c1
refactor: maintenance tests and enhance fault injector client
nkaradzhov Jan 29, 2026
b547cb6
refactor: improve SMIGRATED push message parsing and add comprehensiv…
nkaradzhov Jan 29, 2026
4e8e571
refactor: #handleSmigrated: move source cleanup outside destinations …
nkaradzhov Jan 29, 2026
fd66749
refactor: add error handling to #handleSmigrated with try-catch-finally
nkaradzhov Jan 29, 2026
34835d1
refactor: replace hardcoded node ID 'asdff' with meaningful smigrated…
nkaradzhov Jan 29, 2026
e3589d7
fix: merge conflict residuals
nkaradzhov Jan 30, 2026
b9d332e
refactor: remove extra db deletion
nkaradzhov Feb 2, 2026
41af3ef
test: iterate over all trigger requirements and improve test naming
nkaradzhov Feb 3, 2026
85cde4f
uncomment tests
nkaradzhov Feb 3, 2026
d0a40e7
test: refactor test naming to use single baseTestName variable with i…
nkaradzhov Feb 3, 2026
e4de041
remove debug logs
nkaradzhov Feb 4, 2026
27d153f
fix: prevent PubSub subscription loss during cluster maintenance
nkaradzhov Feb 4, 2026
65c4323
Fix PubSub test hangs by awaiting publish batches
nkaradzhov Feb 4, 2026
491f5d6
Fix slot migration hangs during SMIGRATED handling
nkaradzhov Feb 4, 2026
5b573c9
improve FI debug logs
nkaradzhov Feb 6, 2026
e23cff1
implement unrelaxation
nkaradzhov Feb 6, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -8,3 +8,4 @@ node_modules/
dump.rdb
documentation/
tsconfig.tsbuildinfo
*.log
175 changes: 175 additions & 0 deletions CLUSTER-ARCHITECTURE.txt
@@ -0,0 +1,175 @@
Redis Cluster Architecture Diagram
===================================

                                     ┌─────────────────────────────────────────┐
                                     │           RedisCluster (Root)           │
                                     │                                         │
                                     │  - _options: RedisClusterOptions        │
                                     │  - _slots: RedisClusterSlots            │
                                     │  - _commandOptions                      │
                                     │                                         │
                                     │  Methods:                               │
                                     │  + connect()                            │
                                     │  + sendCommand()                        │
                                     │  + MULTI()                              │
                                     │  + SUBSCRIBE() / UNSUBSCRIBE()          │
                                     │  + SSUBSCRIBE() / SUNSUBSCRIBE()        │
                                     │  + close() / destroy()                  │
                                     └──────────────┬──────────────────────────┘
                                                    │ contains
                         ┌──────────────────────────┴──────────────────────────┐
                         │                                                     │
                         │                  RedisClusterSlots                  │
                         │                                                     │
                         │  - slots: Array<Shard>[16384]                       │
                         │  - masters: Array<MasterNode>                       │
                         │  - replicas: Array<ShardNode>                       │
                         │  - nodeByAddress: Map<string, Node>                 │
                         │  - pubSubNode?: PubSubNode                          │
                         │  - clientSideCache?: PooledClientSideCacheProvider  │
                         │                                                     │
                         │  Methods:                                           │
                         │  + connect()                                        │
                         │  + getClient()                                      │
                         │  + rediscover()                                     │
                         │  + getPubSubClient()                                │
                         │  + getShardedPubSubClient()                         │
                         │  + getRandomNode()                                  │
                         │  + getSlotRandomNode()                              │
                         └───────┬─────────────────┬─────────────────┬─────────┘
                                 │                 │                 │
                      ┌──────────┘                 │                 └─────────────┐
                      │                            │                               │
                      │ has many                   │ optionally has                │ has many
                      ▼                            ▼                               ▼
           ┌────────────────────────┐  ┌────────────────────────┐  ┌────────────────────────┐
           │          Shard         │  │       PubSubNode       │  │      RedisClient       │
           │                        │  │                        │  │       (per node)       │
           │  - master: MasterNode  │  │  - address: string     │  │                        │
           │  - replicas?: Array    │  │  - client: RedisClient │  │  Socket, Queue, etc.   │
           │  - nodesIterator       │  │  - connectPromise      │  │                        │
           └──────────┬─────────────┘  └────────────────────────┘  └────────────────────────┘
                      │ contains
         ┌────────────┴────────────┐
         │                         │
         ▼                         ▼
┌──────────────────┐      ┌──────────────────┐
│    MasterNode    │      │    ShardNode     │
│                  │      │    (replica)     │
│  - id: string    │      │                  │
│  - host: string  │      │  - id: string    │
│  - port: number  │      │  - host: string  │
│  - address       │      │  - port: number  │
│  - readonly: no  │      │  - address       │
│  - client?       │      │  - readonly: yes │
│  - pubSub?       │      │  - client?       │
│    └─> client    │      │                  │
│    └─> promise   │      │                  │
└──────────────────┘      └──────────────────┘


Additional Components:
─────────────────────

┌────────────────────────────────────┐
│      RedisClusterMultiCommand      │
│                                    │
│  Used for MULTI/PIPELINE:          │
│  - Batches commands                │
│  - Routes to single node           │
│  - Returns typed results           │
│                                    │
│  Methods:                          │
│  + addCommand()                    │
│  + exec()                          │
│  + execAsPipeline()                │
└────────────────────────────────────┘

┌────────────────────────────────────┐
│   PooledClientSideCacheProvider    │
│    (BasicPooledClientSideCache)    │
│                                    │
│  RESP3 Client-Side Caching:        │
│  - Shared across all nodes         │
│  - Invalidation tracking           │
│  - TTL & eviction policies         │
│                                    │
│  Methods:                          │
│  + get() / set()                   │
│  + invalidate()                    │
│  + clear() / enable() / disable()  │
└────────────────────────────────────┘


Key Relationships:
─────────────────

1. RedisCluster
   └─> RedisClusterSlots (manages topology)
       └─> Shard[] (16,384 hash slots)
           ├─> MasterNode (read/write)
           │   └─> RedisClient
           │   └─> PubSub RedisClient (sharded pub/sub)
           └─> ShardNode[] (replicas, read-only if useReplicas=true)
               └─> RedisClient

2. RedisCluster
   └─> RedisClusterMultiCommand (for transactions)

3. RedisClusterSlots
   └─> PubSubNode (global pub/sub)
       └─> RedisClient

4. RedisClusterSlots
   └─> PooledClientSideCacheProvider (shared cache, RESP3 only)


Command Flow:
────────────

Single Command:
  Client.sendCommand()
    → Cluster._execute()
      → Slots.getClient(key, isReadonly)
        → Calculate slot from key
        → Get Shard for slot
        → Return master or replica client
      → Client.sendCommand()
      → [If MOVED/ASK error]
        → Slots.rediscover()
        → Retry with new node
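
The "Calculate slot from key" step above follows the Redis Cluster specification: CRC16 (XMODEM variant) of the key, or of its hash tag when one is present, modulo 16384. A minimal TypeScript sketch of that step, together with parsing a MOVED/ASK redirect, might look like the following. The names `crc16`, `keySlot`, and `parseRedirect` are illustrative and are not taken from this PR or from the client's API. The sketch also assumes ASCII keys, whereas a real client hashes the raw key bytes.

```typescript
// Illustrative sketch only: function names are hypothetical, not the library's API.
const SLOT_COUNT = 16384;

// CRC16/XMODEM (polynomial 0x1021, initial value 0), the variant Redis Cluster uses.
// Assumption: the key is ASCII, so each char code is treated as one byte.
function crc16(data: string): number {
  let crc = 0;
  for (let i = 0; i < data.length; i++) {
    crc ^= (data.charCodeAt(i) & 0xff) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// Hash-tag rule: if the key contains "{...}" with at least one character
// between the first "{" and the next "}", only that substring is hashed.
function hashTagOrKey(key: string): string {
  const open = key.indexOf("{");
  if (open === -1) return key;
  const close = key.indexOf("}", open + 1);
  if (close === -1 || close === open + 1) return key;
  return key.slice(open + 1, close);
}

function keySlot(key: string): number {
  return crc16(hashTagOrKey(key)) % SLOT_COUNT;
}

type Redirect = { kind: "MOVED" | "ASK"; slot: number; host: string; port: number };

// Redirect errors have the form "MOVED <slot> <host>:<port>" or "ASK <slot> <host>:<port>".
function parseRedirect(message: string): Redirect | undefined {
  const match = /^(MOVED|ASK) (\d+) (.*):(\d+)$/.exec(message);
  if (!match) return undefined;
  return {
    kind: match[1] as "MOVED" | "ASK",
    slot: Number(match[2]),
    host: match[3],
    port: Number(match[4]),
  };
}
```

With this sketch, `keySlot("foo")` evaluates to 12182, and two keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, land in the same slot.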

Transaction (MULTI):
  Client.MULTI(routing)
    → RedisClusterMultiCommand
      → Accumulate commands
      → All commands must route to same node
      → client.exec()
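
The single-node constraint in the transaction flow above can be illustrated with a small sketch. This is not the RedisClusterMultiCommand implementation. `SingleNodeBatch`, `ownerOf`, and `plan` are hypothetical names, and the sketch assumes the caller supplies a function that maps a key to the address of the node owning its slot.

```typescript
// Hypothetical sketch: not the RedisClusterMultiCommand implementation.
type QueuedCommand = { key: string; args: string[] };

class SingleNodeBatch {
  private readonly queued: QueuedCommand[] = [];
  private readonly ownerOf: (key: string) => string;
  private target: string | undefined;

  // ownerOf (assumed, caller-supplied) maps a key to the owning node's address.
  constructor(ownerOf: (key: string) => string) {
    this.ownerOf = ownerOf;
  }

  // Accumulate a command; reject it if it would route to a different node.
  addCommand(key: string, args: string[]): this {
    const owner = this.ownerOf(key);
    if (this.target === undefined) {
      this.target = owner; // the first key fixes the routing for the whole batch
    } else if (owner !== this.target) {
      throw new Error(
        `command for "${key}" routes to ${owner}, but the batch is bound to ${this.target}`
      );
    }
    this.queued.push({ key, args });
    return this;
  }

  // Describe what would be sent to the single target node.
  plan(): { node: string | undefined; commands: string[][] } {
    return { node: this.target, commands: this.queued.map((c) => c.args) };
  }
}
```

Checking the owner on every `addCommand()` call surfaces a routing conflict while the batch is being built, before anything is sent to a node.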

Pub/Sub:
  Global:  Uses single PubSubNode
  Sharded: Uses per-master pubSub client based on channel hash


Discovery & Failover:
─────────────────────

1. Initial Connect:
   - Try rootNodes in random order
   - Execute CLUSTER SLOTS command
   - Build slot → shard mapping
   - Create client connections

2. Rediscovery (on MOVED error):
   - Clear cache
   - Re-fetch CLUSTER SLOTS
   - Update topology
   - Reconnect clients to new nodes

3. Node Address Mapping:
   - nodeAddressMap translates cluster IPs
   - Useful for NAT/Docker scenarios
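
The "Build slot → shard mapping" and address-translation steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in this PR. The `SlotRange` shape stands in for an already-parsed CLUSTER SLOTS reply, `buildSlotTable` and `translate` are hypothetical helpers, and the address map is modeled only as a plain object keyed by "host:port".

```typescript
// Simplified sketch; all type and function names here are hypothetical.
type NodeAddress = { host: string; port: number };
type SlotRange = { from: number; to: number; master: NodeAddress; replicas: NodeAddress[] };
type Shard = { master: NodeAddress; replicas: NodeAddress[] };
type AddressMap = Record<string, NodeAddress>;

const SLOT_COUNT = 16384;

// Translate a cluster-announced address (e.g. a Docker-internal IP)
// to one reachable by the client; unmapped addresses pass through.
function translate(addr: NodeAddress, map?: AddressMap): NodeAddress {
  const mapped = map ? map[`${addr.host}:${addr.port}`] : undefined;
  return mapped ? mapped : addr;
}

// Fill a 16384-entry table so that slot lookup is a single array index.
// Every slot in a range shares the same Shard object.
function buildSlotTable(ranges: SlotRange[], map?: AddressMap): Array<Shard | undefined> {
  const slots = new Array<Shard | undefined>(SLOT_COUNT);
  for (const range of ranges) {
    const shard: Shard = {
      master: translate(range.master, map),
      replicas: range.replicas.map((r) => translate(r, map)),
    };
    for (let slot = range.from; slot <= range.to; slot++) {
      slots[slot] = shard;
    }
  }
  return slots;
}

// Usage: two shards splitting the slot space, with one NAT-translated master.
const table = buildSlotTable(
  [
    { from: 0, to: 8191, master: { host: "10.0.0.1", port: 6379 }, replicas: [] },
    { from: 8192, to: 16383, master: { host: "10.0.0.2", port: 6379 }, replicas: [] },
  ],
  { "10.0.0.1:6379": { host: "127.0.0.1", port: 7000 } }
);
```

On rediscovery, the whole table would be rebuilt from a fresh CLUSTER SLOTS reply, which keeps per-command lookups constant-time.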