Description
I ran into a SQLite failure during large RxDB replications (150k+ documents) where findDocumentsById() builds a single WHERE id IN (?, ?, ...) query with more parameters than SQLite allows.
Here's the simplified error that I was getting:
Error: too many SQL variables
at Object.all (.../node_modules/rxdb-premium/dist/esm/plugins/storage-sqlite/sqlite-basics-helpers.js)
at n.all (.../node_modules/rxdb-premium/dist/esm/plugins/storage-sqlite/sqlite-storage-instance.js)
at n.findDocumentsById (.../node_modules/rxdb-premium/dist/esm/plugins/storage-sqlite/sqlite-storage-instance.js)
at async Promise.all (index 0)
at async Object.next (.../node_modules/rxdb/dist/esm/plugins/storage-remote/remote.js)
What I Tried
I tried lowering the replication batch size and batching reads from my tables, but the error still happened.
Runtime logging showed the oversized query was not coming from normal pull batches. Pull batches stayed around 900.
What Seems To Be Happening
The oversized lookup appears to come from replication metadata checks.
Path:
replication-protocol/upstream
persistToMaster()
getAssumedMasterState(state, docIds)
metaInstance.findDocumentsById(...)
- SQLite storage builds one large
IN (...) query
One runtime log showed:
collectionName: rx-replication-meta-...
idCount: 112500
shardCount: 2
maxShardIdCount: 61754
So even with sharding, one shard received more IDs than SQLite could accept.
Expected Behavior
RxDB SQLite storage should avoid creating SQL statements with more bound parameters than SQLite supports.
Large findDocumentsById() calls should probably be chunked internally and merged before returning to the caller.
Actual Behavior
findDocumentsById() receives a large ID array and passes the whole array into one SQLite IN (...) query, causing:
Error: too many SQL variables
Workaround That Fixed It Locally
I wrapped the storage instance and chunked findDocumentsById() calls before they reached SQLite.
Using chunks of 8000 IDs fixed the issue locally. The same large sync completed successfully after this change.
const SQLITE_FIND_BY_ID_SAFE_CHUNK_SIZE = 8000;
function chunkArray<T>(items: T[], chunkSize: number): T[][] {
const chunks: T[][] = [];
for (let index = 0; index < items.length; index += chunkSize) {
chunks.push(items.slice(index, index + chunkSize));
}
return chunks;
}
// RxDB can ask SQLite to fetch replication metadata for tens of thousands of
// documents at once. Split those ID lookups before they become a huge SQL IN list.
function withChunkedFindDocumentsById<Internals, InstanceCreationOptions>(
storage: RxStorage<Internals, InstanceCreationOptions>
): RxStorage<Internals, InstanceCreationOptions> {
return {
...storage,
async createStorageInstance<RxDocType>(
params: RxStorageInstanceCreationParams<RxDocType, InstanceCreationOptions>
): Promise<RxStorageInstance<RxDocType, Internals, InstanceCreationOptions>> {
const instance = await storage.createStorageInstance(params);
const originalFindDocumentsById = instance.findDocumentsById.bind(instance);
instance.findDocumentsById = async (ids, deleted) => {
if (ids.length > SQLITE_FIND_BY_ID_SAFE_CHUNK_SIZE) {
const chunks = chunkArray(ids, SQLITE_FIND_BY_ID_SAFE_CHUNK_SIZE);
const chunkResults = await Promise.all(
chunks.map((chunk) => originalFindDocumentsById(chunk, deleted))
);
return chunkResults.flat();
}
return originalFindDocumentsById(ids, deleted);
};
return instance;
}
};
}
Description
I ran into a SQLite failure during large RxDB replications (150k+ documents) where
findDocumentsById()builds a singleWHERE id IN (?, ?, ...)query with more parameters than SQLite allows.Here's the simplified error that I was getting:
Error: too many SQL variables at Object.all (.../node_modules/rxdb-premium/dist/esm/plugins/storage-sqlite/sqlite-basics-helpers.js) at n.all (.../node_modules/rxdb-premium/dist/esm/plugins/storage-sqlite/sqlite-storage-instance.js) at n.findDocumentsById (.../node_modules/rxdb-premium/dist/esm/plugins/storage-sqlite/sqlite-storage-instance.js) at async Promise.all (index 0) at async Object.next (.../node_modules/rxdb/dist/esm/plugins/storage-remote/remote.js)What I Tried
I tried lowering the replication batch size and batching reads from my tables, but the error still happened.
Runtime logging showed the oversized query was not coming from normal pull batches. Pull batches stayed around
900.What Seems To Be Happening
The oversized lookup appears to come from replication metadata checks.
Path:
replication-protocol/upstreampersistToMaster()getAssumedMasterState(state, docIds)metaInstance.findDocumentsById(...)IN (...)queryOne runtime log showed:
So even with sharding, one shard received more IDs than SQLite could accept.
Expected Behavior
RxDB SQLite storage should avoid creating SQL statements with more bound parameters than SQLite supports.
Large
findDocumentsById()calls should probably be chunked internally and merged before returning to the caller.Actual Behavior
findDocumentsById()receives a large ID array and passes the whole array into one SQLiteIN (...)query, causing:Workaround That Fixed It Locally
I wrapped the storage instance and chunked
findDocumentsById()calls before they reached SQLite.Using chunks of
8000IDs fixed the issue locally. The same large sync completed successfully after this change.