docs: enterprise test plan with comprehensive gap analysis by kovtcharov-amd · Pull Request #63 · extropolis/claudia

kovtcharov-amd · 2026-05-13T16:15:42Z

Summary

Deep code analysis of the Claudia codebase (~41,400 LOC) identifying testing gaps and defining a phased test plan to reach enterprise-grade reliability.

Key findings

399 existing tests across 10 modules — all passing
~25,000+ LOC across 16 backend modules and 35 frontend components have zero test coverage
No integration tests, E2E tests, or performance tests exist
11 known production bugs need regression tests

Changes in this PR

docs/plans/enterprise-test-plan.md — comprehensive 6-phase test plan defining ~685 new tests
Accurate per-module LOC and test counts (verified against codebase)
Known Bug Regression Tests section with specific production failures

Revisions applied during review

Fixed per-module test counts (several were inaccurate)
Corrected total codebase size from ~31,600 to ~41,400 LOC
Added 5 missing backend modules (mobile-page, voice-supervisor, voice-agent-page, opencode-backend, usage-reporter)
Added electron package (523 LOC)
Corrected frontend component size from ~8,000 to 13,014 LOC (35 components)
Added Known Bug Regression Tests section with 11 production bugs
Removed non-doc code changes (those belong in PR #61, "fix: buffer PTY output during resize to prevent text corruption")

Test plan phases

Phase	Tests	Priority
Phase 1: Backend Unit Tests	~315	P0
Phase 2: Server Integration	~160	P0
Phase 3: Frontend Tests	~95	P1
Phase 4: Integration & E2E	~35	P1
Phase 5: Robustness & Security	~45	P2
Phase 6: Performance	~10	P2
Known Bug Regressions	~25	P0
Total	~685 new
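As a quick arithmetic check on the table above, the rows can be summed directly. The six numbered phases add up to 660, which appears to be where the ~660 figure in the earlier commit message and the linked issue title comes from; adding the regression row gives the ~685 total. The variable names below are illustrative only.

```typescript
// Counts copied from the phase table; all are approximate ("~") in the plan.
const phases: Array<[string, number]> = [
  ["Phase 1: Backend Unit Tests", 315],
  ["Phase 2: Server Integration", 160],
  ["Phase 3: Frontend Tests", 95],
  ["Phase 4: Integration & E2E", 35],
  ["Phase 5: Robustness & Security", 45],
  ["Phase 6: Performance", 10],
];
const knownBugRegressions = 25;

// Sum of the six numbered phases, then the total including regressions.
const phaseTotal = phases.reduce((sum, [, count]) => sum + count, 0);
const grandTotal = phaseTotal + knownBugRegressions;

console.log(phaseTotal, grandTotal); // 660 685
```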

showBrowseButton was hardcoded to false, preventing users from using the native folder picker. Set to true so the Browse button appears and opens the OS folder dialog via the existing WebSocket handler.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add drag-and-drop support on the workspace panel: drop a folder from the OS file explorer to add it as a workspace. In Electron, the full path is extracted directly. In the browser, opens the path input modal.
- Fix Browse button: use REST endpoint instead of blocking WebSocket execFileSync which froze the server. Only show Browse in Electron mode where the native dialog works reliably.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Root cause: syncWorkspaceMcpConfigs wrote .mcp.json to the claudia project root on every startup, triggering tsx watch to restart the server in an infinite loop. Now skips syncing to Claudia's own workspace directory.
Also:
- Fix Browse button: add -STA flag for Windows PowerShell folder dialog, remember last browsed path across sessions, kdialog fallback on Linux
- Re-enable Browse button in Add Workspace dialog
- Fix drag-and-drop: only activate for external OS drops (Files type), internal workspace reordering drags pass through unaffected
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
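The restart-loop fix comes down to excluding the server's own project root from the list of directories that receive a .mcp.json. A minimal sketch of that filter follows; `workspacesToSync` and its parameters are hypothetical names, not taken from the Claudia source, and the real syncWorkspaceMcpConfigs may do the check differently.

```typescript
import * as path from "node:path";

// Hypothetical helper: returns the workspace directories that are safe to
// write a config file into, leaving out the server's own project root so a
// file watcher on that root is not triggered on every startup.
function workspacesToSync(workspaces: string[], ownRoot: string): string[] {
  const self = path.resolve(ownRoot);
  // Resolve each path first so trailing slashes or relative segments
  // do not defeat the comparison.
  return workspaces.filter((dir) => path.resolve(dir) !== self);
}
```

Comparing resolved paths rather than raw strings is a design choice in this sketch; it does not account for symlinks or case-insensitive filesystems.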

The 512KB caps were too aggressive: users lost scrollback history after rotation. Disk files now cap at 10MB (rotate keeping 5MB tail), and clients receive up to 2MB of history for scrollback. Memory loading on reconnect remains capped at 512KB to prevent OOM.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
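The caps described above can be sketched as constants plus a tail-keeping rotation. This is an in-memory illustration only: the real code works on disk files, the constant and function names are hypothetical, and treating 1MB as 1024 * 1024 bytes is an assumption.

```typescript
// Limits from the commit message; names and the 1024-based units are assumptions.
const MAX_DISK_BYTES = 10 * 1024 * 1024;      // rotate once history exceeds this
const ROTATE_KEEP_BYTES = 5 * 1024 * 1024;    // tail kept after rotation
const MAX_CLIENT_HISTORY_BYTES = 2 * 1024 * 1024;
const MAX_RECONNECT_MEMORY_BYTES = 512 * 1024;

// Returns the last `limit` bytes of `data`, or all of it if it already fits.
function tail(data: Uint8Array, limit: number): Uint8Array {
  return data.length <= limit ? data : data.subarray(data.length - limit);
}

// Rotation: history under the cap is left alone; history over the cap is
// cut down to its most recent `keepBytes`.
function rotateKeepingTail(
  data: Uint8Array,
  maxBytes: number = MAX_DISK_BYTES,
  keepBytes: number = ROTATE_KEEP_BYTES,
): Uint8Array {
  return data.length <= maxBytes ? data : tail(data, keepBytes);
}
```

A client replay would then use `tail(history, MAX_CLIENT_HISTORY_BYTES)`, and reconnect loading `tail(history, MAX_RECONNECT_MEMORY_BYTES)`. Cutting at a raw byte offset can land mid-character, so the result would still need a UTF-8 boundary adjustment before decoding.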

When a scrollbar appears/disappears during active output, the container width changes by ~15px, flipping cols by 1-2. This caused Claude Code's TUI to re-render at alternating widths, producing overlapping garbled text.
Fix:
- Skip resize events where cols changed by <= 2 (scrollbar noise)
- Track last sent cols/rows to deduplicate
- Increase ResizeObserver debounce from 50ms to 150ms
- Use fitTerminal() (fit + refresh) to clear artifacts after resize
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
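The first two items of the fix (noise threshold and deduplication) can be sketched as a small filter in front of the resize message. `ResizeFilter` and `TermSize` are hypothetical names; only the 2-column threshold comes from the commit message. The sketch assumes that a change in rows is always forwarded, which the commit message does not state, and it leaves out the 150ms debounce.

```typescript
// Hypothetical types and names; only the threshold is from the commit message.
interface TermSize {
  cols: number;
  rows: number;
}

const SCROLLBAR_NOISE_COLS = 2; // column changes of <= 2 are treated as noise

class ResizeFilter {
  private lastSent: TermSize | null = null;

  /** Returns true when the new size should be forwarded to the backend. */
  shouldSend(next: TermSize): boolean {
    const prev = this.lastSent;
    if (prev !== null) {
      const colDelta = Math.abs(next.cols - prev.cols);
      const rowsChanged = next.rows !== prev.rows;
      // Covers both exact duplicates (delta 0) and scrollbar jitter (delta 1-2).
      // Assumption: a row change is always real and is never skipped.
      if (!rowsChanged && colDelta <= SCROLLBAR_NOISE_COLS) return false;
    }
    this.lastSent = { cols: next.cols, rows: next.rows };
    return true;
  }
}
```

Because the comparison is against the last size actually sent, a slow drift of one column at a time is still forwarded once it accumulates past the threshold.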

References #59 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

After sending a resize to the backend, buffer incoming PTY output for 250ms. This gives the PTY time to process SIGWINCH and start rendering at the new width. Without buffering, output rendered at the OLD width arrives at xterm (already at the NEW width), causing ANSI cursor positioning commands to misalign and produce garbled overlapping text.
The buffer accumulates output chunks during the transition, then flushes them all at once after the PTY has caught up. History tracking refs are updated even during buffering so scroll-up loading stays consistent.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
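A minimal sketch of the buffering window described above, assuming a single `write` callback stands in for the terminal write. `ResizeOutputBuffer`, `Scheduler`, and the method names are hypothetical; the injectable scheduler exists only so the timing can be exercised without real timers. How the actual change handles a second resize inside the window is not stated, so this sketch simply keeps the existing window.

```typescript
// Hypothetical names; only the 250ms window is from the commit message.
type Scheduler = (fn: () => void, delayMs: number) => void;

const RESIZE_SETTLE_MS = 250;

class ResizeOutputBuffer {
  private pending: string[] = [];
  private buffering = false;
  private readonly write: (data: string) => void;
  private readonly schedule: Scheduler;

  constructor(write: (data: string) => void, schedule?: Scheduler) {
    this.write = write;
    // Default to a real timer; tests can inject a fake scheduler instead.
    this.schedule = schedule ?? ((fn, ms) => { setTimeout(fn, ms); });
  }

  /** Call right after a resize message has been sent to the backend. */
  onResizeSent(): void {
    if (this.buffering) return; // assumption: keep the window already open
    this.buffering = true;
    this.schedule(() => this.flush(), RESIZE_SETTLE_MS);
  }

  /** Call for every PTY output chunk received from the backend. */
  onOutput(chunk: string): void {
    if (this.buffering) {
      this.pending.push(chunk); // hold output produced during the transition
    } else {
      this.write(chunk);
    }
  }

  // Writes everything held during the window as one chunk, in arrival order.
  private flush(): void {
    this.buffering = false;
    if (this.pending.length === 0) return;
    const data = this.pending.join("");
    this.pending = [];
    this.write(data);
  }
}
```

Any bookkeeping that must stay current during the window (the commit mentions history tracking refs) would be updated in `onOutput` before the buffering check.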

Deep analysis of the Claudia codebase identified ~15,000+ LOC across 12 backend modules and all frontend components with zero test coverage. The plan defines 6 phases covering ~660 new tests to reach enterprise-grade reliability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Three corruption fixes based on deep code review:
1. Buffer task:output during task:restore processing. Between term.reset() and term.write(history) completion, live output was being interleaved with the history replay, causing garbled overlapping text on every task click. Output is now queued and flushed after history write completes.
2. Same fix for loadEarlierChunkIfNeeded scroll-up rewrites: live output was interleaving with the reset+rewrite cycle.
3. Fix UTF-8 multi-byte character splitting in readTaskHistoryRange. When reading at arbitrary byte offsets, the read could start mid-character (e.g., byte 2 of a 3-byte '─' char), producing Unicode replacement characters. Now skips leading continuation bytes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
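The third fix relies on a property of UTF-8: continuation bytes always match the bit pattern 10xxxxxx, so a read that starts mid-character can be realigned by dropping them. A minimal sketch follows; the function names are hypothetical and this is not the actual readTaskHistoryRange code.

```typescript
/** True for UTF-8 continuation bytes, which have the bit pattern 10xxxxxx. */
function isContinuationByte(byte: number): boolean {
  return (byte & 0xc0) === 0x80;
}

/**
 * Returns a view of `chunk` that starts on a character boundary.
 * A valid UTF-8 character has at most 3 continuation bytes, so the scan
 * is bounded; anything beyond that is left for the decoder to report.
 */
function skipLeadingContinuationBytes(chunk: Uint8Array): Uint8Array {
  let start = 0;
  while (start < chunk.length && start < 3 && isContinuationByte(chunk[start])) {
    start++;
  }
  return chunk.subarray(start);
}
```

For example, '─' (U+2500) encodes as the bytes E2 94 80. A read that begins at the second byte sees 94 80 followed by the next character; both leading bytes are continuation bytes and are skipped. This handles only the start of the range; a character cut off at the end of a read needs separate handling.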

…plan # Conflicts: # frontend/src/components/TerminalView.tsx

…ssions
- Fix per-module test counts (several were fabricated, e.g., token-parser claimed 46, actual 30; validation claimed 57, actual 28)
- Correct codebase size: 41,400 LOC (was 31,600; the actual size is 31% larger)
- Add 5 missing backend modules: mobile-page (2,088 LOC), voice-supervisor (424), voice-agent-page (642), opencode-backend corrected to 853 (was ~400), usage-reporter (68)
- Add electron package (523 LOC)
- Correct frontend component count: 35 components / 13,014 LOC (was ~8,000)
- Add Known Bug Regression Tests section with 11 production bugs
- Remove non-doc code changes (belong in separate PR #61)

Ovtcharov and others added 9 commits May 11, 2026 15:40

docs: add git worktree support design plan

0716881

References #59 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kovtcharov-amd mentioned this pull request May 13, 2026

Enterprise test coverage: ~660 tests across 6 phases #64 (Open, 30 tasks)

Ovtcharov added 2 commits May 13, 2026 10:41

Merge remote-tracking branch 'origin/main' into docs/enterprise-test-…

412c8e1

…plan # Conflicts: # frontend/src/components/TerminalView.tsx

kovtcharov-amd changed the title ~~docs: enterprise test plan with gap analysis~~ docs: enterprise test plan with comprehensive gap analysis May 13, 2026

kovtcharov-amd enabled auto-merge (squash) May 13, 2026 18:11

kovtcharov approved these changes May 13, 2026

kovtcharov-amd merged commit d8a5602 into main May 13, 2026
3 checks passed

kovtcharov-amd deleted the docs/enterprise-test-plan branch May 13, 2026 18:11

kovtcharov-amd merged 11 commits into main from docs/enterprise-test-plan