Summary
Split oversized block lists into sub-blocks to bound lookup time for hot accounts (accounts modified very frequently, e.g. token contracts or fee recipients).
Background
The archive index stores a block list per account per range. For most accounts this list is small. For hot accounts modified thousands of times within a single range, the list can exceed 32KB, at which point loading and binary-searching it becomes expensive. Sub-blocking splits these large lists into fixed-size chunks with a descriptor index, so only the relevant chunk needs to be loaded.
Design
New DB segment
ARCHIVE_ACCOUNT_META [accountHash + rangeId] → [sub-block descriptors]
Each descriptor is 14 bytes: [maxBlock (8)] [entries (2)] [subBlockId (4)]. When a block list exceeds the threshold (about 32KB, or 4096 entries), it is split into sub-blocks keyed by subBlockId.
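To make the 14-byte layout concrete, here is a minimal sketch of how a descriptor could be encoded and decoded. The class name SubBlockDescriptorSketch is hypothetical. Big-endian byte order and treating entries as an unsigned 16-bit value are assumptions, not something this issue specifies.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the [maxBlock (8)] [entries (2)] [subBlockId (4)] layout.
final class SubBlockDescriptorSketch {
  static final int SIZE = 14; // 8 + 2 + 4 bytes

  final long maxBlock;   // highest block number stored in the sub-block
  final int entries;     // number of entries, assumed to be an unsigned 16-bit value
  final int subBlockId;  // key suffix identifying the sub-block

  SubBlockDescriptorSketch(long maxBlock, int entries, int subBlockId) {
    if (entries < 0 || entries > 0xFFFF) {
      throw new IllegalArgumentException("entries must fit in 2 bytes: " + entries);
    }
    this.maxBlock = maxBlock;
    this.entries = entries;
    this.subBlockId = subBlockId;
  }

  // Serialises one descriptor; ByteBuffer defaults to big-endian (an assumption here).
  byte[] encode() {
    return ByteBuffer.allocate(SIZE)
        .putLong(maxBlock)
        .putShort((short) entries)
        .putInt(subBlockId)
        .array();
  }

  // Parses a value holding zero or more concatenated descriptors.
  static List<SubBlockDescriptorSketch> decodeAll(byte[] value) {
    if (value.length % SIZE != 0) {
      throw new IllegalArgumentException("length is not a multiple of " + SIZE);
    }
    ByteBuffer buf = ByteBuffer.wrap(value);
    List<SubBlockDescriptorSketch> out = new ArrayList<>(value.length / SIZE);
    while (buf.hasRemaining()) {
      long maxBlock = buf.getLong();
      int entries = Short.toUnsignedInt(buf.getShort());
      int subBlockId = buf.getInt();
      out.add(new SubBlockDescriptorSketch(maxBlock, entries, subBlockId));
    }
    return out;
  }
}
```

With a fixed descriptor size, the number of sub-blocks for an account and range is simply the value length divided by 14.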
Write path
When appending a block number would push the block list past the threshold, ArchiveIndexWriter splits the existing list into sub-blocks and writes the descriptor index to ARCHIVE_ACCOUNT_META.
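The split step might look roughly like the following sketch. The names SubBlockSplitSketch, SubBlock, needsSplit and split are hypothetical, and numbering sub-blocks sequentially from 0 is an assumption. The real ArchiveIndexWriter may chunk and key the data differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the split step; not the actual ArchiveIndexWriter code.
final class SubBlockSplitSketch {
  static final int MAX_ENTRIES = 4096; // threshold quoted in this issue

  // One fixed-size chunk of an ascending block list.
  static final class SubBlock {
    final int subBlockId;
    final long[] blocks;

    SubBlock(int subBlockId, long[] blocks) {
      this.subBlockId = subBlockId;
      this.blocks = blocks;
    }

    // The value a descriptor would record as maxBlock for this chunk.
    long maxBlock() {
      return blocks[blocks.length - 1];
    }
  }

  // True when appending one more block number would exceed the threshold.
  static boolean needsSplit(int currentEntries, int maxEntries) {
    return currentEntries + 1 > maxEntries;
  }

  // Cuts an ascending block list into chunks of at most maxEntries entries.
  static List<SubBlock> split(long[] sortedBlocks, int maxEntries) {
    List<SubBlock> out = new ArrayList<>();
    int id = 0; // assumption: sub-block ids are assigned sequentially from 0
    for (int start = 0; start < sortedBlocks.length; start += maxEntries) {
      int end = Math.min(start + maxEntries, sortedBlocks.length);
      out.add(new SubBlock(id++, Arrays.copyOfRange(sortedBlocks, start, end)));
    }
    return out;
  }
}
```

Each resulting chunk supplies one descriptor: its last element becomes maxBlock and its length becomes entries.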
Read path
ArchiveIndexReader checks for sub-block descriptors first. If present, it binary-searches the descriptors to find the relevant sub-block and loads only that chunk. If absent (cold account), it loads the single block list as before.
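As an illustration of the descriptor lookup, the sketch below binary-searches the maxBlock values of the descriptors, assumed to be sorted ascending, and returns the first sub-block that can contain the queried block. The class and method names are hypothetical. Returning -1 when the queried block lies beyond every sub-block is an assumption; the actual reader may fall back to the last sub-block instead, depending on the query semantics.

```java
// Hypothetical sketch of descriptor selection; not the actual ArchiveIndexReader code.
final class DescriptorSearchSketch {

  // Returns the index of the first descriptor whose maxBlock >= block,
  // or -1 if block is greater than every maxBlock.
  static int findSubBlock(long[] maxBlocks, long block) {
    int lo = 0;
    int hi = maxBlocks.length; // search window is [lo, hi)
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (maxBlocks[mid] < block) {
        lo = mid + 1; // the whole sub-block ends before the queried block
      } else {
        hi = mid; // this sub-block may contain it; keep looking to the left
      }
    }
    return lo < maxBlocks.length ? lo : -1;
  }
}
```

Only the sub-block at the returned index then needs to be loaded and searched, which keeps the work per lookup bounded by the chunk size rather than by the full list length.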
Acceptance Criteria
- Block lists exceeding the threshold are split into sub-blocks
- Descriptors written to ARCHIVE_ACCOUNT_META
- Reader binary-searches descriptors and loads only the relevant sub-block
- Cold accounts (single block list) unaffected
- No regression in historical query correctness
- Benchmark shows bounded lookup time for known hot accounts (e.g. fee recipient)
Prerequisites
- Bonsai Archive: Range presence index and bloom filters #9985 merged
Related
- Bonsai Archive: Range presence index and bloom filters #9985