Save 30-60% on Claude Code costs with proven strategies, real benchmarks, and copy-paste configs.
Claude Code is powerful - but costs add up fast. A single afternoon of heavy coding can burn through $20-50 in tokens. Most of this spend is avoidable with the right setup.
This repo is a collection of battle-tested strategies for reducing Claude Code costs without sacrificing quality. Every technique includes expected savings percentages based on real-world benchmarks.
Apply these 5 changes right now and cut costs immediately:
| # | Strategy | Expected Savings | Effort | Guide |
|---|---|---|---|---|
| 1 | Keep CLAUDE.md under 150 lines - every line loads on every turn | 10-20% | 5 min | Context Optimization |
| 2 | Use Haiku for simple tasks (--model haiku) - 5x cheaper than Opus | 20-40% | 1 min | Model Selection |
| 3 | Use Plan Mode before coding - prevents wasted iterative cycles | 15-25% | 0 min | Workflow Patterns |
| 4 | Add .claudeignore - stop Claude from reading node_modules, dist, lock files | 5-15% | 2 min | Context Optimization |
| 5 | Delegate to subagents - isolate expensive searches from main context | 20-40% | 5 min | Workflow Patterns |
Combined impact: 30-60% reduction in monthly Claude Code spend.
Understanding the billing model is the foundation of optimization.
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Cache Hit (per 1M) | Context Window | Max Output |
|---|---|---|---|---|---|
| Opus 4.6 | $5.00 | $25.00 | $0.50 | 1M | 128K |
| Opus 4.6 (1M, >200K input) | $10.00 (2x) | $37.50 (1.5x) | $1.00 | 1M | 128K |
| Opus 4.6 Fast Mode | $30.00 (6x) | $150.00 (6x) | -- | 1M (included) | 128K |
| Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 1M | 64K |
| Haiku 4.5 | $1.00 | $5.00 | $0.10 | 200K | 64K |
New in 2.1+:
- --max-budget-usd <amount> caps spending per session.
- --fallback-model <model> auto-switches to a cheaper model when the primary is overloaded.
Plans: Pro $20/mo, Max 5x $100/mo, Max 20x $200/mo. Batch API: 50% discount. Cache write: 1.25x (5-min TTL), 2x (1-hour TTL).
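To make the pricing table concrete, here is a minimal sketch of a per-turn cost calculation. The prices are copied from the table above; the `Price` class, the `turn_cost` function, and the example token counts are hypothetical names and numbers chosen for illustration, and the sketch ignores cache-write surcharges and the >200K long-context tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per 1M tokens, copied from the pricing table above."""
    input: float
    output: float
    cache_hit: float


# Standard (non-long-context, non-fast-mode) prices only.
PRICES = {
    "opus": Price(input=5.00, output=25.00, cache_hit=0.50),
    "sonnet": Price(input=3.00, output=15.00, cache_hit=0.30),
    "haiku": Price(input=1.00, output=5.00, cache_hit=0.10),
}


def turn_cost(model: str, fresh_input: int, cached_input: int, output: int) -> float:
    """Return the USD cost of one turn, ignoring cache-write surcharges."""
    p = PRICES[model]
    total = fresh_input * p.input + cached_input * p.cache_hit + output * p.output
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical turn: 2K new input tokens, 20K cached, 1K output.
    for model in PRICES:
        cost = turn_cost(model, fresh_input=2_000, cached_input=20_000, output=1_000)
        print(f"{model:>6}: ${cost:.4f} per turn")
```

For the same hypothetical turn, the Opus figure works out to five times the Haiku figure, which matches the 5x ratio in every price column.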
Anthropic periodically runs temporary promotions that double usage limits during off-peak hours. These are not permanent -- check the Anthropic blog and support page for current promotions.
When a promotion is active, peak hours (normal limits) are typically 8 AM - 2 PM ET on weekdays. All other hours, plus weekends, get 2x usage.
Time zone breakdown:
| Time Zone | Peak (Normal Limits) | 2x Usage Window |
|---|---|---|
| US West (PT) | 5-11 AM | 11 AM - 5 AM + weekends |
| US East (ET) | 8 AM - 2 PM | 2 PM - 8 AM + weekends |
| UK (BST) | 1-7 PM | 7 PM - 1 PM + weekends |
| Central Europe (CET) | 2-8 PM | 8 PM - 2 PM + weekends |
| India (IST) | 6:30 PM - 12:30 AM | Entire workday is 2x |
| China/Singapore (SGT) | 9 PM - 3 AM | Entire workday is 2x |
| Japan/Korea (JST) | 10 PM - 4 AM | Entire workday is 2x |
| Australia (AEDT) | 12-6 AM | Entire workday is 2x |
Key insight: If you're outside the US, your entire workday typically falls in the 2x window during these promotions.
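If you want to check the window programmatically, the following sketch tests whether a given moment falls in the peak window. It assumes the 8 AM - 2 PM ET weekday window described above, which is promotion-specific and may change; `is_peak` is a hypothetical helper, not part of Claude Code.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

try:
    EASTERN = ZoneInfo("America/New_York")
except ZoneInfoNotFoundError:
    # No tz database installed: fall back to fixed EST (ignores daylight saving).
    EASTERN = timezone(timedelta(hours=-5), "EST")

# Assumed peak window, taken from the text above; verify against the current promotion.
PEAK_START = time(8, 0)
PEAK_END = time(14, 0)


def is_peak(moment: datetime) -> bool:
    """True if a timezone-aware moment falls in the weekday 8 AM - 2 PM ET window."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(EASTERN)
    if local.weekday() >= 5:  # Saturday or Sunday
        return False
    return PEAK_START <= local.time() < PEAK_END


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("peak (normal limits)" if is_peak(now) else "off-peak (2x during promotions)")
```

Passing a timezone-aware `datetime` from any region works, because the function converts to Eastern time before comparing.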
Every Claude Code turn = Input Tokens + Output Tokens
Input Tokens (you pay for):
├── System prompt (Claude Code's built-in instructions)
├── CLAUDE.md file (loaded every turn)
├── Conversation history (grows with each turn)
├── File contents (from Read/Grep/Glob results)
├── Tool call results
├── MCP server tool schemas (each server adds ~500-2000 tokens)
└── MCP server responses
Output Tokens (you pay more for):
├── Claude's text responses
├── Tool calls (Read, Edit, Bash, etc.)
├── Code generation
└── Plan mode analysis
Input tokens are charged on every turn. If your CLAUDE.md is 300 lines and a session runs 50 turns, those 300 lines are sent 50 times. Cutting the file to 100 lines removes two-thirds of that recurring input.
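The arithmetic behind the CLAUDE.md example can be sketched as follows. The tokens-per-line figure is an assumption made for illustration (measure your own file for a real number), and the dollar amounts use the Sonnet input price from the pricing table without any caching discount.

```python
# Assumption: average tokens per CLAUDE.md line. Not a measured value.
ASSUMED_TOKENS_PER_LINE = 12
SONNET_INPUT_USD_PER_MTOK = 3.00  # from the pricing table


def recurring_tokens(lines: int, turns: int,
                     tokens_per_line: int = ASSUMED_TOKENS_PER_LINE) -> int:
    """Total tokens sent when a file of `lines` lines is loaded on every turn."""
    return lines * tokens_per_line * turns


def savings_fraction(before_lines: int, after_lines: int) -> float:
    """Share of recurring tokens removed by trimming the file."""
    return 1 - after_lines / before_lines


if __name__ == "__main__":
    turns = 50
    for lines in (300, 100):
        tokens = recurring_tokens(lines, turns)
        usd = tokens * SONNET_INPUT_USD_PER_MTOK / 1_000_000
        print(f"{lines} lines x {turns} turns = {tokens:,} tokens (${usd:.2f} uncached)")
    print(f"Saved: {savings_fraction(300, 100):.0%}")
```

The savings fraction does not depend on the tokens-per-line assumption: going from 300 to 100 lines removes two-thirds of the recurring tokens whatever the per-line average is.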
Claude Code uses prompt caching to reduce costs. Cached input tokens cost significantly less (e.g., $0.50/MTok vs $5.00/MTok on Opus 4.6 - a 90% discount). Content that stays the same between turns (like CLAUDE.md and system prompt) gets cached automatically.
What this means: The first turn is the most expensive; subsequent turns benefit from caching. Keep stable content (CLAUDE.md, file reads) unchanged between turns so the cache is not invalidated.
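A rough model of the caching effect, using the Opus 4.6 prices and the 1.25x cache-write multiplier listed earlier. It assumes the prefix is written to the cache once on the first turn and read from the cache on every later turn, with no cache expiry or invalidation in between; real sessions will be less tidy. The function names and the 10,000-token prefix are hypothetical.

```python
OPUS_INPUT_USD_PER_MTOK = 5.00      # from the pricing table
OPUS_CACHE_HIT_USD_PER_MTOK = 0.50  # from the pricing table
CACHE_WRITE_MULTIPLIER = 1.25       # 5-minute TTL write surcharge


def uncached_cost(prefix_tokens: int, turns: int) -> float:
    """USD cost of resending a stable prefix at full input price every turn."""
    return prefix_tokens * turns * OPUS_INPUT_USD_PER_MTOK / 1_000_000


def cached_cost(prefix_tokens: int, turns: int) -> float:
    """USD cost with one cache write followed by cache reads (assumes no expiry)."""
    if turns < 1:
        return 0.0
    write = prefix_tokens * OPUS_INPUT_USD_PER_MTOK * CACHE_WRITE_MULTIPLIER
    reads = prefix_tokens * OPUS_CACHE_HIT_USD_PER_MTOK * (turns - 1)
    return (write + reads) / 1_000_000


if __name__ == "__main__":
    prefix, turns = 10_000, 50  # hypothetical stable prefix and session length
    plain = uncached_cost(prefix, turns)
    cached = cached_cost(prefix, turns)
    print(f"uncached: ${plain:.4f}  cached: ${cached:.4f}  saved: {1 - cached / plain:.1%}")
```

Under these assumptions a one-turn session is slightly more expensive with caching, because of the write surcharge; the benefit comes from every turn after the first.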
Deep-dive guides for each optimization area:
| Guide | What You'll Learn |
|---|---|
| 01 - Understanding Costs | How billing works, what costs the most, where money goes |
| 02 - Context Optimization | Reduce input tokens: CLAUDE.md, .claudeignore, file reads |
| 03 - Model Selection | When to use Opus vs Sonnet vs Haiku (with decision tree) |
| 04 - Workflow Patterns | Plan mode, subagents, commands, batch operations |
| 05 - Team Budgeting | Per-developer budgets, cost tracking, ROI calculation |
| 06 - Access Methods & Pricing | Compare API vs Bedrock vs Vertex AI vs Claude Code pricing |
| 07 - MCP & Agent Cost Impact | MCP server overhead, subagent costs, Agent SDK patterns |
Real-world cost measurements for common development tasks:
| Benchmark | What's Compared |
|---|---|
| Task Comparison | Same task with/without optimization - before vs after |
| Model Comparison | Opus vs Sonnet vs Haiku for different task types |
| Context Size Impact | How CLAUDE.md size and file reads affect total cost |
All benchmarks include methodology, raw numbers, and reproducible steps.
Copy-paste configs that are already optimized. Drop these into your project:
| Template | Lines | Best For | Link |
|---|---|---|---|
| Minimal | <50 | Maximum savings, solo projects | minimal.md |
| Standard | ~100 | Balanced cost/quality | standard.md |
| Comprehensive | ~150 | Full-featured, team projects | comprehensive.md |
| Monorepo | ~120 | Multi-package workspaces | monorepo.md |
| Stack | Link |
|---|---|
| React + Vite | react-vite.md |
| Next.js | nextjs.md |
| FastAPI + Python | fastapi-python.md |
| MERN Stack | mern.md |
| Terraform + AWS | terraform-aws.md |
| Config | Philosophy | Link |
|---|---|---|
| Cost-Conscious | Aggressive savings, Haiku default | cost-conscious.json |
| Balanced | Sonnet default, smart routing | balanced.json |
| Performance-First | Opus for complex, speed priority | performance-first.json |
| Command | Purpose | Link |
|---|---|---|
| /cost-check | Check current session usage | cost-check.md |
| /budget-mode | Force cost-conscious behavior | budget-mode.md |
| /quick-fix | Minimal-token bug fix | quick-fix.md |
Estimate how many tokens a file or prompt will cost before sending it to Claude:
python tools/token-estimator/estimate.py path/to/file.py
# Output: ~1,247 tokens | Estimated cost: $0.004 (Sonnet input)
python tools/token-estimator/estimate.py path/to/CLAUDE.md --per-turn 50
# Output: ~890 tokens per turn | 50 turns = 44,500 tokens | $0.13 (Sonnet)
Analyze your Claude Code usage patterns to find cost hotspots:
python tools/usage-analyzer/analyze.py ~/.claude/projects/
# Output: Cost breakdown by project, session, and action type
One-page reference: cheatsheet.md
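For readers who want to see the idea behind a token estimate, here is a minimal sketch. It is not the repo's estimate.py: it assumes a crude 4-characters-per-token heuristic rather than a real tokenizer, and the function names are hypothetical. The price is the Sonnet input rate from the pricing table.

```python
import math
import sys
from pathlib import Path

# Assumption: ~4 characters per token. A rough heuristic, not a real tokenizer.
ASSUMED_CHARS_PER_TOKEN = 4
SONNET_INPUT_USD_PER_MTOK = 3.00  # from the pricing table


def estimate_tokens(text: str) -> int:
    """Approximate the token count of `text` from its character length."""
    return math.ceil(len(text) / ASSUMED_CHARS_PER_TOKEN)


def estimate_cost(tokens: int, turns: int = 1,
                  usd_per_mtok: float = SONNET_INPUT_USD_PER_MTOK) -> float:
    """USD input cost of sending `tokens` tokens on each of `turns` turns."""
    return tokens * turns * usd_per_mtok / 1_000_000


if __name__ == "__main__" and len(sys.argv) > 1:
    content = Path(sys.argv[1]).read_text(encoding="utf-8", errors="replace")
    tokens = estimate_tokens(content)
    print(f"~{tokens:,} tokens | Estimated cost: ${estimate_cost(tokens):.3f} (Sonnet input)")
```

The example outputs shown above are consistent with this pricing arithmetic: 890 tokens over 50 turns is 44,500 tokens, which comes to about $0.13 at $3.00 per million.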
We welcome contributions! See CONTRIBUTING.md for guidelines.
Ways to contribute:
- Share a tip: Open an issue using the Tip Submission template
- Add a benchmark: Run a test and submit results using the Benchmark Result template
- Add a template: Submit a CLAUDE.md template for a new stack
- Improve guides: Fix errors, add examples, clarify explanations
- Build a tool: Add utilities that help track or reduce costs
With the Pro plan ($20/mo), Max 5x ($100/mo), or Max 20x ($200/mo), you get included usage. After that, costs depend on your model and token usage. Heavy users report $3-15/day without optimization. With optimization, most drop to $1-5/day. Note that Opus 4.6 is priced at $5/$25 per million input/output tokens, a third of the $15/$75 that earlier Opus models cost, which makes top-tier model usage much more affordable.
Many strategies (context optimization, model selection, prompt engineering) apply to both Claude Code and the API. The templates and commands are Claude Code-specific.
No. The strategies focus on eliminating waste - duplicate context, unnecessary file reads, using expensive models for simple tasks. Quality stays the same or improves (because Claude has less noise to process).
Use the /usage command in Claude Code to see current session stats. For historical tracking, use our Usage Analyzer tool.
Switch to Haiku for routine tasks (formatting, simple fixes, file lookups). It's 5x cheaper than Opus and handles 70% of common coding tasks well. With Opus 4.6 at $5/$25 (a third of what earlier Opus models cost), even the top model is much more accessible, but Haiku at $1/$5 still adds up to meaningful savings at scale.
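As a rough check on what routing to Haiku is worth, the sketch below computes the blended price when a share of your tokens goes to Haiku instead of Opus. It uses the input prices from the pricing table; because output prices have the same 5x ratio, the same fraction applies to output. The function names are hypothetical, and the share is measured in tokens, not tasks, which is an assumption you should adjust to your own usage.

```python
# Input prices in USD per 1M tokens, from the pricing table.
OPUS_USD_PER_MTOK = 5.00
HAIKU_USD_PER_MTOK = 1.00


def blended_price(haiku_token_share: float) -> float:
    """Average USD per 1M tokens when a share of tokens is routed to Haiku."""
    if not 0.0 <= haiku_token_share <= 1.0:
        raise ValueError("haiku_token_share must be between 0 and 1")
    return (haiku_token_share * HAIKU_USD_PER_MTOK
            + (1.0 - haiku_token_share) * OPUS_USD_PER_MTOK)


def savings_vs_all_opus(haiku_token_share: float) -> float:
    """Fraction saved compared with sending every token to Opus."""
    return 1.0 - blended_price(haiku_token_share) / OPUS_USD_PER_MTOK


if __name__ == "__main__":
    for share in (0.25, 0.50, 0.70):
        print(f"{share:.0%} of tokens on Haiku -> {savings_vs_all_opus(share):.0%} saved")
```

With a 5x price gap, the saving is 0.8 times the Haiku token share, so the 20-40% range quoted for this strategy corresponds to routing roughly 25-50% of tokens. Simple tasks tend to use fewer tokens than complex ones, so a 70% share of tasks usually means a smaller share of tokens.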
If this repo helped you save money, consider giving it a star! It helps others find these resources.
MIT - use these strategies, templates, and tools however you want.