Overview
Analyze the total token budget an AI agent loads before the developer types anything, and recommend concrete trimming actions.
Problem
Developers accumulate CLAUDE.md content, skills, and MCP tool descriptions over time without visibility into the total context cost. A bloated context window degrades agent quality, increases latency, and burns through usage budgets, yet developers currently have no way to see the per-source breakdown or to know what to cut.
Scope
- Token audit: Count total tokens loaded per session across CLAUDE.md + all skills + all MCP tool descriptions
- Budget breakdown: Show per-source token usage (for example, "CLAUDE.md: 3,200 tokens, Skills: 28,400 tokens, MCP tools: 15,600 tokens")
- Trimming recommendations: Identify verbose sections, redundant instructions, and token-heavy skills with specific suggestions
- Context bloat detection: Flag skills that overlap, duplicate instructions, or are disproportionately large
- Cost estimation: Translate token counts into estimated monthly cost based on model pricing
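The token audit and budget breakdown items above can be sketched as follows. This is a minimal illustration, not the proposed implementation: `estimate_tokens`, `SourceUsage`, and `budget_breakdown` are hypothetical names, and the 4-characters-per-token ratio is a rough assumption standing in for the model's real tokenizer.

```python
from dataclasses import dataclass

# Assumption: a crude heuristic, not a real tokenizer. Actual counts
# depend on the model and should come from its tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a text by rounding up len / 4."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


@dataclass
class SourceUsage:
    name: str      # e.g. "CLAUDE.md", "Skills", "MCP tools"
    tokens: int    # estimated tokens contributed by this source
    share: float   # fraction of the total, between 0.0 and 1.0


def budget_breakdown(sources: dict[str, list[str]]) -> list[SourceUsage]:
    """Sum estimated tokens per source and compute each source's share.

    `sources` maps a source name to the texts it loads into context.
    Results are sorted from the largest source to the smallest.
    """
    totals = {
        name: sum(estimate_tokens(text) for text in texts)
        for name, texts in sources.items()
    }
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        SourceUsage(name, tokens, tokens / grand_total if grand_total else 0.0)
        for name, tokens in ranked
    ]
```

Sorting by size puts the largest contributor first, which is usually where trimming pays off most.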
CLI UX
$ caliber doctor

Context Budget Analysis
━━━━━━━━━━━━━━━━━━━━━━
Total tokens loaded per session: 47,200

  CLAUDE.md            3,200 tokens   (7%)
  Skills (14 files)   28,400 tokens  (60%)
  MCP tools (5)       15,600 tokens  (33%)

⚠ Recommendations:
  • Skill "full-stack-debug.md" is 4,800 tokens; consider splitting or trimming
  • 3 skills have overlapping instructions (deploy, ship, land-and-deploy)
  • MCP "monday" exposes 47 tools; consider filtering to used tools only

Estimated cost impact: ~$42/month on Opus 1M, ~$4/month on Sonnet
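
One way the overlap recommendation in the sample output could be produced is a pairwise similarity check over skill texts. The sketch below uses Jaccard similarity on word sets, which is an assumed technique rather than anything this proposal specifies. The function names and the 0.6 threshold are hypothetical and would need tuning on real skills.

```python
from itertools import combinations


def word_set(text: str) -> set[str]:
    """Lowercase a text and split it into a set of unique words."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlapping_skills(
    skills: dict[str, str], threshold: float = 0.6
) -> list[tuple[str, str, float]]:
    """Return (skill, skill, score) for pairs at or above the threshold.

    `skills` maps a skill name to its full text. The threshold is an
    illustrative default, not a validated value.
    """
    sets = {name: word_set(body) for name, body in skills.items()}
    pairs = []
    for left, right in combinations(sorted(sets), 2):
        score = jaccard(sets[left], sets[right])
        if score >= threshold:
            pairs.append((left, right, score))
    return pairs
```

Word-set similarity is cheap and needs no model calls, but it misses paraphrased duplicates. An embedding-based comparison would catch more at higher cost.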
Success Criteria
- Developer runs one command and understands their token budget
- Recommendations are actionable (specific files, specific sections)
- Cost estimates are grounded in real model pricing
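For the cost criterion, the arithmetic is tokens per session, times sessions per month, times the price per million input tokens. The sketch below is a simplified model under stated assumptions: context is billed once per session, prompt caching is ignored, and the price is supplied by the caller from the provider's published pricing. `monthly_context_cost` and its parameters are hypothetical names.

```python
def monthly_context_cost(
    tokens_per_session: int,
    sessions_per_month: int,
    usd_per_million_input_tokens: float,
) -> float:
    """Estimate the monthly cost of the preloaded context in USD.

    Assumptions: the context is billed once per session and no caching
    discount applies. Real bills differ when the context is re-sent on
    every request or when cached reads are priced lower.
    """
    total_tokens = tokens_per_session * sessions_per_month
    return total_tokens * usd_per_million_input_tokens / 1_000_000


# Placeholder inputs for illustration only; the price here is not a
# real model price and must be replaced with a published rate.
example = monthly_context_cost(
    tokens_per_session=47_200,
    sessions_per_month=100,
    usd_per_million_input_tokens=5.0,
)
```

Keeping the price as an input rather than a hard-coded table makes it easier to keep estimates in line with current pricing.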