
feat(audit): destructive-op JSONL trail for hard-deletes#1069

Open
vincedk-alt wants to merge 1 commit into garrytan:master from vincedk-alt:feat/destructive-op-audit
Conversation

@vincedk-alt

Summary

Every hard-delete of a source or page row now leaves a forensic JSONL trace at `~/.gbrain/audit/destructive-ops-YYYY-Www.jsonl` (ISO-week rotation, override via `GBRAIN_AUDIT_DIR`). The next "what deleted X?" investigation becomes one grep, not a forensic excavation across cron logs and shell history.

A sibling of `shell-audit.ts` (shell-job submissions) and `rerank-audit.ts` (reranker failures): same naming convention, same env override, same best-effort posture.

Why

When an operator runs `gbrain sources remove --confirm-destructive`, or any destructive code path fires (autopilot purge phase, `pages purge-deleted`, source-archive cascade), the only existing record of "what got deleted, when, from where" lives in agent session transcripts. That works when the action happens inside an agent harness. It fails — silently and permanently — when the action happens from a terminal, a cron job, or an MCP client that doesn't persist per-call.

Concrete trigger: a 2026-05-15 cleanup pass cascade-hard-deleted 229 pages via `sources_remove default --confirm-destructive`. Reconstructing "what just happened?" required manually grepping a downstream agent's session transcripts the next day. Had the operation happened from a terminal, the event would have been permanently unknowable.

This PR closes that bug class. Operator-facing CHANGELOG framing: "destructive ops now leave a JSONL trail so you can answer 'what did I delete last Tuesday?' without rebuilding from cron logs."

What changed

New `src/core/destructive-audit.ts` module (mirrors `shell-audit.ts` + `rerank-audit.ts`). Exports:

  • `computeDestructiveAuditFilename(now)` — pure ISO-8601 week naming
  • `resolveAuditDir()` — env-override-aware
  • `logDestructiveOp(event)` — best-effort append with a stderr warning on write failure
  • `readRecentDestructiveOps(days)` — newest-first reader, tolerates malformed JSONL
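
The ISO-week naming can be sketched as below. This is a minimal reimplementation for illustration, not the shipped code; the real `computeDestructiveAuditFilename` may differ in details:

```typescript
// Sketch of ISO-8601 week naming for the weekly audit file.
// ISO weeks start on Monday; week 1 is the week containing the year's
// first Thursday, so a late-December date can land in W01 of the NEXT
// year (the cross-year boundary the tests below exercise).
function computeDestructiveAuditFilename(now: Date): string {
  const d = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()));
  const isoDay = d.getUTCDay() || 7;         // Mon=1 … Sun=7
  d.setUTCDate(d.getUTCDate() + 4 - isoDay); // jump to this week's Thursday
  const isoYear = d.getUTCFullYear();
  const yearStart = Date.UTC(isoYear, 0, 1);
  const week = Math.ceil(((d.getTime() - yearStart) / 86_400_000 + 1) / 7);
  return `destructive-ops-${isoYear}-W${String(week).padStart(2, "0")}.jsonl`;
}
```

Under this scheme 2026-05-16 falls in W20 and 2025-12-30 falls in 2026-W01, matching the W20 / cross-year-W01 expectations in the test section.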

Wired into five hard-delete sites:

| Site | Op kind |
| --- | --- |
| `pglite-engine.ts:deletePage` | raw delete primitive |
| `pglite-engine.ts:purgeDeletedPages` | autopilot purge phase + manual `pages purge-deleted` |
| `postgres-engine.ts:deletePage` | same on Postgres |
| `postgres-engine.ts:purgeDeletedPages` | same on Postgres |
| `destructive-guard.ts:purgeExpiredSources` | source-level cascade |

Intentionally NOT wired: `softDeletePage`. Soft-deletes are reversible within the 72h recovery window and don't lose data; auditing them would be operational noise.

`purgeDeletedPages` with zero rows purged also skips the audit line. The autopilot cycle runs the purge phase every cycle; writing "purged 0 pages" 24+ times per day on a clean brain is pure disk churn.
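
The best-effort posture could look like the following sketch. The function name echoes the export above, but the shape, return value, and fixed filename are illustrative assumptions, not the module's actual code:

```typescript
import { appendFileSync, mkdirSync } from "node:fs";
import { join } from "node:path";

// Best-effort audit append. Callers skip the call entirely when a purge
// touched zero rows (the no-churn rule above); when called, a write
// failure warns on stderr but never throws, so the destructive op
// itself always completes. Returns whether a line was written.
function logDestructiveOp(event: Record<string, unknown>, dir: string): boolean {
  try {
    mkdirSync(dir, { recursive: true });
    const line = JSON.stringify({ ts: new Date().toISOString(), ...event }) + "\n";
    // The real module derives the weekly ISO-week filename; a fixed
    // name keeps this sketch self-contained.
    appendFileSync(join(dir, "destructive-ops.jsonl"), line);
    return true;
  } catch (err) {
    console.error(`destructive-audit: write failed: ${(err as Error).message}`);
    return false;
  }
}
```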

Page-slug truncation: when `page_slugs.length > 50`, the array is sliced to the first 50 and a `page_slugs_truncated: true` marker is added. The `pages_purged` count remains accurate as ground truth. This stops one bulk-delete of 10K rows from producing a 10K-string JSONL line.
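
That truncation rule can be sketched like this (illustrative types and helper name; the field names follow the sample output):

```typescript
interface PurgeAuditEvent {
  pages_purged: number;            // ground-truth count, never truncated
  page_slugs: string[];
  page_slugs_truncated?: boolean;  // present only when the list was cut
}

const SLUG_CAP = 50;

// Cap the slug list so one 10K-row purge can't emit a 10K-string line,
// while keeping pages_purged as the accurate total.
function capSlugList(event: PurgeAuditEvent): PurgeAuditEvent {
  if (event.page_slugs.length <= SLUG_CAP) return event;
  return {
    ...event,
    page_slugs: event.page_slugs.slice(0, SLUG_CAP),
    page_slugs_truncated: true,
  };
}
```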

Sample output

```jsonl
{"ts":"2026-05-16T05:30:12Z","op":"deletePage","engine":"pglite","slug":"wiki/people/alice","source_id":"default"}
{"ts":"2026-05-16T05:35:21Z","op":"purgeDeletedPages","engine":"pglite","older_than_hours":72,"pages_purged":7,"page_slugs":["wiki/a","wiki/b","..."]}
{"ts":"2026-05-16T05:40:01Z","op":"purgeExpiredSources","engine":"pglite","sources_purged":1,"source_ids":["default"]}
```

Operator forensic query: `tail ~/.gbrain/audit/destructive-ops-$(date +%Y-W%V).jsonl` answers "what got destroyed this week?" instantly.
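
The tolerant read path (newest-first ordering, skipping malformed lines from partial writes) might be sketched as follows; this is an assumed shape, not the shipped `readRecentDestructiveOps`:

```typescript
// Parse JSONL text newest-first, silently skipping malformed lines
// (e.g. a partial write from a crashed process). An audit reader
// should never crash the forensics it exists to enable.
function parseAuditJsonl(text: string): Array<Record<string, unknown>> {
  const events: Array<Record<string, unknown>> = [];
  for (const line of text.split("\n")) {
    if (!line.trim()) continue;
    try {
      events.push(JSON.parse(line));
    } catch {
      // tolerate garbage lines rather than aborting the whole read
    }
  }
  return events.reverse(); // the file is append-only, so reversed = newest-first
}
```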

Tests

`test/destructive-audit.test.ts` — 14 cases, 43 expect calls, all passing:

Pure helpers: ISO-week filename W20 + cross-year W01 boundary; `resolveAuditDir` env override + default.

Write+read roundtrip (tmpdir `GBRAIN_AUDIT_DIR`): single roundtrip; newest-first ordering; truncation at 50 with marker; ≤50 untruncated; malformed JSONL skipping; days-window filter.

Best-effort posture: unwritable audit dir → no throw (op continues).

End-to-end through the PGLite engine: `engine.deletePage` emits the expected line; `engine.purgeDeletedPages` emits the expected line with all slugs; a zero-row purge emits NO line (churn regression guard).

Regression: the existing 100 `pglite-engine.test.ts` + 35 `orphans.test.ts` + 24 `sources-ops.test.ts` cases all still pass.

Verification

  • `bun run typecheck` — clean
  • `bun run verify` — clean
  • `bun test test/destructive-audit.test.ts` — 14 pass / 0 fail
  • `bun test test/pglite-engine.test.ts` — 100 pass / 0 fail
  • `bun test test/orphans.test.ts` — 35 pass / 0 fail
  • `bun test test/sources-ops.test.ts` — 24 pass / 0 fail

Reviewer notes

  • Privacy: the audit log writes slugs but never page content, frontmatter, or chunk text. The slug is the smallest identifier that lets an operator reconstruct what happened.
  • Disk pressure: a single hard-delete is ~150 bytes of JSONL. Even at 1,000 deletes per week that's ~150 KB/week, under 8 MB/year per brain — trivial.
  • No new dependencies. Pure Node fs + path.
  • Sibling-pattern precedent: `shell-audit.ts` has been in production since v0.20.4 with this exact write/rotation shape; `rerank-audit.ts` joined in v0.35.0.0. This PR adds the third audit-log file using the same idiom.

Follow-ups (NOT in this PR)

  • Optional doctor check `destructive_ops_summary` that surfaces "N hard-deletes in last 7 days" — opt-in; the audit file alone is enough for operator forensics.
  • Caller-context fields (`auth.client_id`, remote flag) require threading `OperationContext` below the engine layer — separate concern. The current trace has enough information for the bug class motivating this PR.

🤖 Generated with Claude Code
