fix(native): validate ABI cache via child-process probe before trusting it #238

Open
husniadil wants to merge 1 commit into mksglu:next from husniadil:fix/validate-abi-cache
@husniadil husniadil commented Apr 8, 2026

Summary

Follow-up to #148. The ABI caching mechanism in ensureNativeCompat trusts cached binaries based solely on their filename label (e.g. abi137.node), without verifying that the binary actually matches the current Node ABI. This causes persistent FTS5 failures that survive session restarts.

Root cause

Two issues in the original probe logic:

  1. No validation after cache swap: if a binary was cached under the wrong ABI label (e.g. a Node 20 session caches an ABI 115 binary as abi137.node), every subsequent Node 24 session silently fails.

  2. In-process require() can't detect swapped binaries — native .node modules are cached at the dlopen level (per-process). Additionally, require('better-sqlite3') only loads the JS wrapper; the native binary is lazy-loaded on first Database instantiation, so the original probe never actually triggered dlopen.
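To make the filename-label problem concrete, here is a minimal sketch of how an ABI-labelled cache path could be derived. The helper names `currentAbi` and `abiCachePath` are hypothetical and not taken from the PR; only the `abi<N>.node` naming and the `build/Release/` directory come from the description. It uses CommonJS for brevity, whereas the real hook is an `.mjs` module.

```javascript
// Sketch only: helper names are assumptions, not the PR's actual code.
const path = require('node:path');

// process.versions.modules holds the ABI (NODE_MODULE_VERSION) of the
// running Node binary as a string.
function currentAbi() {
  return process.versions.modules;
}

// Builds the cache filename for a given ABI, e.g. build/Release/abi137.node.
// Nothing here inspects the file's contents, which is why the label alone
// cannot prove that the binary was compiled for that ABI.
function abiCachePath(releaseDir, abi = currentAbi()) {
  return path.join(releaseDir, `abi${abi}.node`);
}
```

The label is only as trustworthy as the session that wrote it, so some form of load-time validation is still needed.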

Fix

Probe via child process (node -e "new (require('better-sqlite3'))(':memory:').close()") which gets a fresh dlopen cache and triggers actual native binary loading. Falls through to npm rebuild if the cached binary is invalid.
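A minimal sketch of such a child-process probe follows. The function name `probeInChildProcess`, the 30-second timeout, and the use of `process.execPath` (rather than a bare `node` on PATH) are assumptions made for illustration; only the probe one-liner itself is quoted from the description.

```javascript
// Sketch only: names, timeout and options are assumptions, not the PR's code.
const { execFileSync } = require('node:child_process');

// The one-liner from the PR description: instantiating a Database forces
// the native binary to be loaded instead of just the JS wrapper.
const PROBE_SCRIPT = "new (require('better-sqlite3'))(':memory:').close()";

// Runs the script in a fresh Node process, so nothing loaded by the
// current process can mask a binary that changed on disk.
// Returns true when the child exits with code 0, false on any failure.
function probeInChildProcess(script = PROBE_SCRIPT, cwd = process.cwd()) {
  try {
    execFileSync(process.execPath, ['-e', script], {
      cwd,              // directory from which better-sqlite3 is resolved
      stdio: 'ignore',  // only the exit status matters
      timeout: 30000,   // assumed upper bound so a hung child cannot block
    });
    return true;
  } catch {
    return false;
  }
}
```

Because the result is a plain boolean, the caller can fall through to a rebuild whenever it is false without parsing error messages.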

Also:

  • Removed dead createRequire import
  • Added codesignBinary() after rebuild in the non-cache path (prevents SIGKILL on macOS hardened runtime)
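For readers unfamiliar with the signing step, the sketch below shows one way a post-rebuild signing helper could look. It assumes ad-hoc signing through the macOS `codesign` tool; the PR does not show what codesignBinary() actually runs, so the function name, arguments, and options here are illustrative only.

```javascript
// Sketch only: assumes ad-hoc signing; the PR's codesignBinary() may differ.
const { execFileSync } = require('node:child_process');

// Re-signs a freshly built binary on macOS and does nothing elsewhere.
// `platform` and `run` are injectable so the logic can be exercised
// without a Mac. Returns true only if the signing command was run
// and did not throw.
function codesignBinarySketch(
  binaryPath,
  { platform = process.platform, run = execFileSync } = {}
) {
  if (platform !== 'darwin') return false;
  try {
    // "--sign -" requests an ad-hoc signature; "--force" replaces any
    // existing signature on the file.
    run('codesign', ['--sign', '-', '--force', binaryPath], { stdio: 'ignore' });
    return true;
  } catch {
    return false; // signing is best-effort in this sketch
  }
}
```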

Behavioral change

The old probe only triggered rebuild on NODE_MODULE_VERSION errors specifically. The new child-process probe treats any failure as needing rebuild — intentionally broader to handle edge cases (corrupt binaries, missing deps) at the cost of occasional unnecessary rebuilds (~5-30s).
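The difference between the two policies can be sketched as two small predicates. Both function names are hypothetical, and the regular expression is an assumption about how the old check matched the error text.

```javascript
// Sketch only: illustrates the policy change, not the PR's actual code.

// Old policy (as described): rebuild only when the load error mentions
// NODE_MODULE_VERSION, i.e. an explicit ABI mismatch.
function oldProbeWantsRebuild(error) {
  return Boolean(error) && /NODE_MODULE_VERSION/.test(String(error.message));
}

// New policy: any failed child-process probe leads to a rebuild,
// whatever the underlying cause.
function newProbeWantsRebuild(probeSucceeded) {
  return !probeSucceeded;
}
```

Under the old policy a corrupt or truncated binary, which fails with a different message, would never trigger a rebuild; under the new policy it does.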

Test plan

Unit tests

  • 6 new Vitest cases added to tests/hooks/ensure-deps.test.ts:
    • Corrupted ABI cache → detects invalid, rebuilds, re-caches
    • Valid ABI cache → fast path (no rebuild)
    • Missing cache + compatible binary → probes and creates cache
    • Missing cache + incompatible binary → rebuilds and caches
    • Corrupted cache + missing binary → recovers via rebuild
    • Graceful degradation → no throw when both probe and rebuild fail
  • npx vitest run tests/hooks/ensure-deps.test.ts — 14/14 pass
  • Tests also pass under Node 20 via mise exec node@20.19.4 -- npx vitest run tests/hooks/ensure-deps.test.ts
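The scenarios covered by the new cases can be read as a small decision flow. The sketch below is an assumption about that flow, with every side effect injected as a hypothetical callback (`hasCache`, `swapInCache`, `probe`, `rebuild`, `saveCache`); it is not the body of ensureNativeCompat.

```javascript
// Sketch only: control flow inferred from the test list, not the PR's code.
// Returns a label for the path taken; never throws.
function ensureNativeCompatSketch({ hasCache, swapInCache, probe, rebuild, saveCache }) {
  try {
    if (hasCache()) {
      swapInCache();                    // put the cached binary in place
      if (probe()) return 'cache-hit';  // valid cache: fast path, no rebuild
      // invalid cache: fall through to rebuild
    } else if (probe()) {
      saveCache();                      // existing binary works: cache it
      return 'cached-existing';
    }
    rebuild();                          // assumed to throw on failure
    saveCache();                        // re-cache under the current ABI label
    return 'rebuilt';
  } catch {
    return 'degraded';                  // probe and rebuild both failed
  }
}
```

Injecting the callbacks keeps the flow testable without touching the filesystem or spawning npm, which is roughly what a mocked Vitest case would need.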

Real-world validation

Patched the installed plugin (~/.claude/plugins/cache/context-mode/context-mode/1.0.75/hooks/ensure-deps.mjs) and tested across live Claude Code sessions with mise-managed Node versions:

| Step | Node | ABI | Action | ctx doctor FTS5 |
|------|------|-----|--------|-----------------|
| 1 | v24.11.0 | 137 | Clean start (no ABI cache files) | ✅ PASS (abi137.node cached) |
| 2 | v20.18.0 | 115 | Switch to Node 20 project dir | ✅ PASS (auto-rebuild, abi115.node cached) |
| 3 | v24.11.0 | 137 | Switch back to Node 24 | ✅ PASS (fast path from abi137.node cache) |

After the test, both abi115.node and abi137.node coexist in build/Release/. Switching Node versions is seamless: once a version's binary has been cached, no further rebuild is needed for it.

🤖 Generated with Claude Code

…ng it

Cached binaries were swapped in based solely on filename (e.g. abi137.node)
without verifying the binary actually matches the current Node ABI. This
caused persistent FTS5 failures when a binary compiled for one Node version
was incorrectly cached under another ABI label.

Two issues in the original ensureNativeCompat:

1. No validation after swapping in a cached binary — if the cache was
   stale/corrupt, every subsequent session would fail silently.

2. In-process require() can't detect on-disk binary changes because
   native .node modules are cached at the dlopen level. Additionally,
   require('better-sqlite3') only loads the JS wrapper — the native
   binary is lazy-loaded on first Database instantiation.

Fix: probe via child process (`node -e "new (require('better-sqlite3'))
(':memory:').close()"`) which gets a fresh dlopen cache and triggers
actual native binary loading. Falls through to npm rebuild if the
cached binary is invalid.

Also removes dead createRequire import and adds codesignBinary call
after rebuild in the non-cache path.

Adds 6 Vitest cases: corrupted cache recovery, valid cache fast path,
missing cache with compatible/incompatible binary, missing binary edge
case, and graceful degradation when both probe and rebuild fail.

Refs: mksglu#148

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>