Guide 02: Context Optimization

Every token in your input is a token you pay for. This guide covers practical strategies to reduce input tokens without sacrificing Claude's ability to help you. Applied together, these techniques can reduce input costs by 30-50%.

Why Input Tokens Matter
CLAUDE.md: The Line Budget
Before and After: CLAUDE.md Optimization
.claudeignore: Stop Indexing Junk
File Read Strategies
Using /compact to Reset Context
Subagents: Isolating Context-Heavy Work
Writing Concise Prompts
Memory Files vs Inline Instructions
Putting It All Together

Why Input Tokens Matter

Input tokens are the tokens sent to Claude on each turn. They include everything: the system prompt, your CLAUDE.md, the full conversation history, tool results, and your current message.

The critical insight is that input tokens are cumulative and recurring. Unlike output tokens (which are generated once), input tokens include all previous conversation history — so they grow with every turn and you pay for them repeatedly.

Turn  1 input:   4,500 tokens   (system + CLAUDE.md + your message)
Turn 10 input:  35,000 tokens   (all of the above + 9 turns of history)
Turn 30 input: 100,000 tokens   (all of the above + 29 turns of history)
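
The growth in the table above can be reproduced with a small simulation. This is a minimal sketch: the 4,500-token first turn comes from the table, while the 3,300 tokens of history added per turn is an assumed average chosen to roughly match it, not a measured value.

```python
def input_tokens_per_turn(first_turn_tokens: int, added_per_turn: int, turns: int) -> list[int]:
    """Input size of each turn when every turn re-sends all earlier history."""
    return [first_turn_tokens + added_per_turn * t for t in range(turns)]


# 4,500 is the turn-1 figure from the table; 3,300 per turn is an assumption.
sizes = input_tokens_per_turn(first_turn_tokens=4_500, added_per_turn=3_300, turns=30)

print(f"turn 1 input:  {sizes[0]:,}")
print(f"turn 10 input: {sizes[9]:,}")
print(f"turn 30 input: {sizes[29]:,}")
print(f"total input billed over the session: {sum(sizes):,}")
```

The last line is the one that drives cost: under these assumptions the session bills about 1.57 million input tokens in total, even though no single turn goes much beyond 100,000.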

Every token you can keep out of the input — by trimming CLAUDE.md, ignoring irrelevant files, avoiding unnecessary file reads, and compacting history — saves you money on every subsequent turn.

The math is straightforward: remove 1,000 tokens of recurring input, and over a 30-turn session you save 30,000 input tokens. At Sonnet pricing with an 80% cache rate, that is about $0.03 per session. Do that across 5 sessions a day for a month (about 110 sessions), and it adds up to $3.30 from just that one cut. Now multiply by the 10-20 cuts this guide will show you.
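
The arithmetic in the previous paragraph can be written out as a short calculation. This is a sketch under the guide's own figures ($3/MTok Sonnet input, cached tokens billed at 10% of full price); the 22 working days per month is an assumption inferred from the 110-sessions-per-month figure used elsewhere in this guide.

```python
SONNET_INPUT_PER_MTOK = 3.00   # USD, the Sonnet input price used in this guide
CACHE_READ_MULTIPLIER = 0.10   # cached tokens cost 10% of full price (per this guide)


def blended_price_per_mtok(base_price: float, cache_hit_rate: float) -> float:
    """Average price per million input tokens for a given cache hit rate."""
    uncached = (1 - cache_hit_rate) * base_price
    cached = cache_hit_rate * base_price * CACHE_READ_MULTIPLIER
    return uncached + cached


def recurring_savings(tokens_removed: int, turns: int, price_per_mtok: float) -> float:
    """Dollars saved in one session by removing tokens that recur on every turn."""
    return tokens_removed * turns * price_per_mtok / 1_000_000


price = blended_price_per_mtok(SONNET_INPUT_PER_MTOK, cache_hit_rate=0.80)
per_session = recurring_savings(tokens_removed=1_000, turns=30, price_per_mtok=price)
per_month = per_session * 5 * 22  # 5 sessions a day, 22 working days (assumed)

print(f"blended price: ${price:.2f}/MTok")
print(f"per session:   ${per_session:.4f}")
print(f"per month:     ${per_month:.2f}")
```

Unrounded, the session figure is about $0.025 and the monthly figure about $2.77; the $3.30 quoted above comes from rounding to $0.03 per session before multiplying.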

CLAUDE.md: The Line Budget

Why Every Line Costs You Money

Your project's CLAUDE.md file is loaded in its entirety as part of the input on every single turn. It does not matter whether the current turn is about database schemas or CSS styling — the whole file is always there.

This makes CLAUDE.md the highest-leverage optimization target because:

It affects every turn in every session
It is under your direct control
Most CLAUDE.md files contain 2-3x more content than Claude actually needs

Hard Limits: 4,000 Characters Per File, 12,000 Total

Based on community research into Claude Code's internals, there are precise limits on instruction files that make the "keep it short" advice more concrete:

Per-file limit: Each instruction file (CLAUDE.md, CLAUDE.local.md, .claude/CLAUDE.md, .claude/instructions.md) is truncated at 4,000 characters. Content beyond this limit is silently dropped -- Claude never sees it.
Total budget: The combined content across all instruction files loaded for a session is capped at 12,000 characters. This budget is shared across every instruction file discovered by walking from the filesystem root to your current working directory.
Discovery order: Claude Code walks from the filesystem root to your cwd, checking each directory level for: CLAUDE.md, CLAUDE.local.md, .claude/CLAUDE.md, and .claude/instructions.md. Files with identical content across scopes are deduplicated automatically.

What this means in practice:

Scenario	Per-File Budget	Total Budget	Risk
Single project, one CLAUDE.md	4,000 chars	12,000 chars	Plenty of room
Monorepo with root + 2 workspace CLAUDE.md files	4,000 chars each	12,000 chars shared	Must keep all three under 12K combined
Nested dirs with parent CLAUDE.md files (e.g., `~/CLAUDE.md` + `~/code/CLAUDE.md` + project)	4,000 chars each	12,000 chars shared	Parent files eat into your budget silently

At roughly 7 characters per word and 10 words per line, 4,000 characters is approximately 57 lines of typical CLAUDE.md content. The old advice of "keep under 150 lines" assumed shorter lines -- with the 4,000-character hard cap, the real constraint is character count, not line count.

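For monorepos: If your root CLAUDE.md is 3,500 characters, 8,500 characters of the 12,000-character combined budget remain for the other files. Split across two workspace files, that is 4,250 characters each, so the 4,000-character per-file cap is the binding limit; split across three, each has only ~2,833 characters of headroom. Plan your instruction hierarchy accordingly.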
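
A budget check along these lines can make the limits concrete. This is only a sketch: the 4,000 and 12,000 figures are the community-reported limits quoted in this guide, the file paths are hypothetical, and the assumption that only the first 4,000 characters of a file count toward the shared total is mine, not something the guide confirms.

```python
PER_FILE_LIMIT = 4_000   # characters, as reported in this guide (community-sourced)
TOTAL_LIMIT = 12_000     # characters across all loaded instruction files


def check_budget(file_sizes: dict[str, int]) -> dict[str, object]:
    """Report which files exceed the per-file cap and how much shared budget is left."""
    over_cap = {name: size - PER_FILE_LIMIT
                for name, size in file_sizes.items() if size > PER_FILE_LIMIT}
    # Assumption: only the first PER_FILE_LIMIT characters of each file count toward the total.
    effective_total = sum(min(size, PER_FILE_LIMIT) for size in file_sizes.values())
    return {
        "over_cap": over_cap,
        "effective_total": effective_total,
        "remaining": TOTAL_LIMIT - effective_total,
    }


# Hypothetical monorepo: a root file plus two workspace files.
report = check_budget({
    "CLAUDE.md": 3_500,
    "packages/web/CLAUDE.md": 4_600,
    "packages/api/CLAUDE.md": 2_900,
})
print(report)
```

In practice the sizes would come from something like `len(Path(p).read_text())` for each instruction file on the path to your working directory.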

The 150-Line Budget (Approximate Guideline)

We recommend keeping CLAUDE.md under 150 lines as an approximate guideline, but the hard limit is 4,000 characters per file, which 150 lines only fits if they average about 26 characters each. Here is the cost breakdown by size:

CLAUDE.md Size	Tokens Per Turn	Cost Per Turn (Sonnet)	30-Turn Session Cost	Monthly (110 sessions)
50 lines	~350	$0.001	$0.03	$3.30
100 lines	~700	$0.002	$0.06	$6.60
150 lines	~1,050	$0.003	$0.09	$9.90
300 lines	~2,100	$0.006	$0.18	$19.80
500 lines	~3,500	$0.011	$0.33	$36.30

The cost columns use Sonnet's full input price of $3/MTok with no caching, so they are upper bounds. With 80% cache hits, the real cost (and the real saving from trimming) is about 20% of these figures, plus a little more because the cache is not free: cached tokens still cost 10% of full price.

On Opus 4.6, multiply these numbers by ~1.67x (Opus input is $5/MTok vs Sonnet's $3/MTok). A 500-line CLAUDE.md on Opus 4.6 costs about $60.50/month just for the CLAUDE.md itself across 110 sessions.

What Belongs in CLAUDE.md

Include only information Claude needs on most turns:

Include	Why
Tech stack (language, framework, versions)	Affects every code suggestion
Build/test/lint commands	Claude runs these frequently
Project structure overview (5-10 lines max)	Helps Claude find files
Critical coding conventions	Prevents repeated corrections
Error handling patterns	Applies to most code written

What Does NOT Belong in CLAUDE.md

Exclude	Why	Where to Put It Instead
Detailed API documentation	Only relevant for API tasks	Separate docs file, reference as needed
Full directory tree	Stale quickly, Claude can use `ls`/`Glob`	Let Claude explore dynamically
Lengthy code examples	Takes many tokens for situational use	Nearby files in the codebase
Team member names/roles	Irrelevant to coding	Project wiki
Changelog/history	Never needed for code generation	CHANGELOG.md
Deployment procedures	Only relevant during deploy	docs/deployment.md
Commented-out alternatives	Adds tokens for no active benefit	Delete them
Aspirational rules not yet enforced	Confuses Claude	Add when enforced

The "Every Line" Audit

Go through your CLAUDE.md and ask for each line: "Does Claude need this on EVERY turn?"

If yes, keep it.
If "only sometimes," move it to a separate file or a custom command.
If "rarely," delete it.
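
To decide where to start the audit, it can help to see which sections of the file are largest. The following is a rough sketch, not a full markdown parser: it treats any line starting with `#` as a heading, so `#` comments inside code blocks would be miscounted as sections. The sample text is made up for illustration.

```python
def section_sizes(markdown: str) -> list[tuple[str, int, int]]:
    """Return (heading, line_count, char_count) per section, largest first."""
    sections: dict[str, list[str]] = {"(preamble)": []}
    current = "(preamble)"
    for line in markdown.splitlines():
        if line.startswith("#"):
            current = line.lstrip("#").strip()
            sections.setdefault(current, [])
        sections[current].append(line)
    # +1 per line accounts for the newline character.
    rows = [(name, len(lines), sum(len(ln) + 1 for ln in lines))
            for name, lines in sections.items() if lines]
    return sorted(rows, key=lambda row: row[2], reverse=True)


sample = """# MyApp
Tech: TypeScript, React 18

## Team
- Alice (Tech Lead)
- Bob (Frontend)

## Commands
- npm test
"""

for heading, lines, chars in section_sizes(sample):
    print(f"{chars:5d} chars  {lines:3d} lines  {heading}")
```

Sections that are both large and only occasionally relevant are the first candidates to move out or delete.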

Before and After: CLAUDE.md Optimization

Before: 380 Lines (Bloated)

# MyApp Project

## Overview
MyApp is an e-commerce platform built with React and Node.js. It was started in 2023
by the engineering team at Acme Corp. The platform serves over 10,000 customers
and processes approximately $2M in transactions monthly. We migrated from a legacy
PHP application in Q3 2023 and have been iterating on the platform since then.

## Team
- Alice (Tech Lead) — alice@acme.com
- Bob (Frontend) — bob@acme.com
- Carol (Backend) — carol@acme.com
- Dave (DevOps) — dave@acme.com

## Tech Stack
- Frontend: React 18.2.0 with TypeScript 5.3
- State Management: Redux Toolkit 2.0
- Styling: Tailwind CSS 3.4 with custom design system
- Build Tool: Vite 5.0
- Backend: Node.js 20 LTS with Express 4.18
- Database: PostgreSQL 16 with Prisma ORM 5.7
- Cache: Redis 7.2
- Message Queue: RabbitMQ 3.12
- Search: Elasticsearch 8.11
- Authentication: Passport.js with JWT
- API Documentation: Swagger/OpenAPI 3.0
- Monitoring: Datadog APM + custom dashboards
- CI/CD: GitHub Actions
- Hosting: AWS (ECS Fargate + RDS + ElastiCache)
- CDN: CloudFront

## Directory Structure

src/
├── client/
│   ├── components/
│   │   ├── atoms/
│   │   │   ├── Button/
│   │   │   ├── Input/
│   │   │   ├── Text/
│   │   │   ├── Icon/
│   │   │   └── Badge/
│   │   ├── molecules/
│   │   │   ├── SearchBar/
│   │   │   ├── ProductCard/
│   │   │   ├── CartItem/
│   │   │   └── NavLink/
│   │   ├── organisms/
│   │   │   ├── Header/
│   │   │   ├── Footer/
│   │   │   ├── ProductGrid/
│   │   │   ├── ShoppingCart/
│   │   │   └── CheckoutForm/
│   │   └── pages/
│   │       ├── Home/
│   │       ├── ProductDetail/
│   │       ├── Cart/
│   │       ├── Checkout/
│   │       ├── Account/
│   │       └── Admin/
│   ├── hooks/
│   ├── store/
│   │   ├── slices/
│   │   └── middleware/
│   ├── services/
│   ├── utils/
│   └── types/
├── server/
│   ├── routes/
│   ├── controllers/
│   ├── models/
│   ├── middleware/
│   ├── services/
│   └── utils/
└── shared/
    ├── types/
    └── constants/


## Coding Conventions

### General Rules
- Use TypeScript strict mode everywhere
- No `any` types allowed — use `unknown` if type is truly unknown
- Prefer `const` over `let`, never use `var`
- Use early returns to reduce nesting
- Maximum function length: 50 lines
- Maximum file length: 300 lines
- Use descriptive variable names (no single-letter variables except in loops)
- All functions must have JSDoc comments
- All exported functions must have unit tests
- Use absolute imports with path aliases (@client/, @server/, @shared/)
- Handle all errors explicitly — no empty catch blocks
- Log errors with structured logging (winston)
- Use enums for fixed sets of values
- Prefer composition over inheritance
- Use dependency injection where possible

### React Conventions
- Functional components only (no class components)
- Use custom hooks for shared logic
- Props interfaces must be exported and named ComponentNameProps
- Use React.memo for expensive renders
- Lazy load routes with React.lazy
- Use Suspense with fallback components
- Event handlers should be prefixed with "handle" (handleClick, handleSubmit)
- Use controlled forms with react-hook-form
- Use zod for form validation
- Separate business logic from UI components
- Use error boundaries for each route

### Backend Conventions
- All routes must have input validation using zod
- Use middleware for auth, logging, error handling
- Controller functions should be thin — delegate to services
- Services contain business logic
- Models define database schema and relationships
- Use transactions for multi-table operations
- Return consistent error responses: { error: string, code: number }
- Rate limit all public endpoints
- Use pagination for list endpoints (limit/offset)

### Database Conventions
- Use Prisma migrations for all schema changes
- Name tables in snake_case plural (user_accounts, order_items)
- Name columns in snake_case (created_at, updated_by)
- Always include created_at and updated_at timestamps
- Use UUIDs for primary keys
- Add indexes for frequently queried columns
- Use soft deletes (deleted_at) instead of hard deletes

### Testing
- Unit tests: Vitest + React Testing Library
- E2E tests: Playwright
- Minimum 80% code coverage
- Test file naming: *.test.ts or *.test.tsx
- Use factories for test data (src/test/factories/)
- Mock external services in tests
- Integration tests for API routes
- Run tests: npm test (unit), npm run test:e2e (e2e)
- Run specific test: npm test -- --grep "test name"

### Git Conventions
- Branch naming: feature/TICKET-123-description, fix/TICKET-456-description
- Commit messages: type(scope): description (conventional commits)
- Squash merge to main
- Require PR reviews from at least one team member
- Run CI checks before merge

## Build Commands
- npm run dev — Start development server (Vite + Express)
- npm run build — Production build
- npm test — Run unit tests
- npm run test:e2e — Run Playwright e2e tests
- npm run lint — ESLint + Prettier check
- npm run lint:fix — Auto-fix linting issues
- npm run type-check — TypeScript type checking
- npm run db:migrate — Run Prisma migrations
- npm run db:seed — Seed database with test data
- npm run db:studio — Open Prisma Studio
- npm run storybook — Open Storybook
- npm run analyze — Bundle size analysis

## API Endpoints (Current)
- POST /api/auth/login — User login
- POST /api/auth/register — User registration
- GET /api/products — List products (paginated)
- GET /api/products/:id — Product detail
- POST /api/cart — Add to cart
- GET /api/cart — Get cart
- PUT /api/cart/:itemId — Update cart item
- DELETE /api/cart/:itemId — Remove from cart
- POST /api/orders — Create order
- GET /api/orders — List user orders
- GET /api/orders/:id — Order detail
- GET /api/admin/dashboard — Admin dashboard stats
- ... (20 more endpoints)

## Recent Changes
- 2024-01-15: Migrated from Redux to Redux Toolkit
- 2024-01-10: Added Elasticsearch for product search
- 2024-01-05: Upgraded to Node 20 LTS
- 2023-12-20: Added Playwright E2E tests
- 2023-12-15: Implemented real-time notifications with WebSockets

## Known Issues
- Cart total calculation sometimes rounds incorrectly (TICKET-789)
- Search indexing can lag by up to 30 seconds (TICKET-823)
- Admin dashboard slow with >10k orders (TICKET-856)

## Deployment
### Staging
1. Push to staging branch
2. GitHub Actions runs tests
3. Builds Docker image
4. Deploys to ECS Fargate staging cluster
5. Runs smoke tests

### Production
1. Create PR from staging to main
2. Require 2 approvals
3. Merge triggers production pipeline
4. Blue-green deployment via ECS
5. Automatic rollback on health check failure
6. Post-deploy: verify monitoring dashboards

Token count: ~2,660 tokens per turn. Over 30 turns with 80% caching: ~$0.048 (Sonnet 4.6) / ~$0.080 (Opus 4.6)

After: 62 Lines (Optimized)

# MyApp — E-commerce Platform

Tech: TypeScript strict, React 18, Redux Toolkit, Tailwind, Vite 5
Backend: Node 20, Express, PostgreSQL 16 + Prisma, Redis, RabbitMQ
Auth: Passport.js + JWT | Search: Elasticsearch | Hosting: AWS ECS

## Commands
- `npm run dev` — Dev server (Vite + Express)
- `npm run build` — Production build
- `npm test` — Unit tests (Vitest + RTL)
- `npm run test:e2e` — E2E tests (Playwright)
- `npm run lint:fix` — ESLint + Prettier autofix
- `npm run type-check` — TypeScript checks
- `npm run db:migrate` — Prisma migrations

## Structure
- `src/client/` — React frontend (components/, hooks/, store/, services/)
- `src/server/` — Express backend (routes/, controllers/, models/, services/)
- `src/shared/` — Shared types and constants

## Code Rules
- No `any` — use `unknown` if needed
- Functional components only, use custom hooks for shared logic
- Props: export interface ComponentNameProps
- Early returns, max 50-line functions, max 300-line files
- Use @client/, @server/, @shared/ path aliases
- Controlled forms: react-hook-form + zod validation
- All routes: zod input validation, consistent { error, code } responses
- Thin controllers → services for business logic
- Use transactions for multi-table operations
- Soft deletes (deleted_at), UUIDs for PKs, snake_case tables/columns
- All exported functions need unit tests, 80% coverage minimum

## Tests
- Unit: `npm test -- --grep "name"` | E2E: `npm run test:e2e`
- Test data factories in `src/test/factories/`
- Mock external services in tests

## Git
- Branches: feature/TICKET-123-desc, fix/TICKET-456-desc
- Commits: type(scope): description (conventional commits)
- Squash merge, 1+ review required

Token count: ~434 tokens per turn. Over 30 turns with 80% caching: ~$0.008 (Sonnet 4.6) / ~$0.013 (Opus 4.6)

What Was Cut and Why

Removed Content	Lines Saved	Reason
Company overview and team info	12 lines	Irrelevant to code generation
Full directory tree	40 lines	Claude can explore with `ls`/`Glob`; trees go stale
Detailed component breakdown	8 lines	Structure summary is sufficient
API endpoint listing	20 lines	Claude can read route files directly
Deployment procedures	15 lines	Only needed during deploys (use a custom command)
Recent changes / changelog	6 lines	Not needed for writing code
Known issues	4 lines	Reference tickets when relevant, not every turn
Monitoring/CDN/CI details	5 lines	Only relevant for infra tasks
Redundant/verbose phrasing	~50 lines	Compressed into terse, high-density format
Storybook, bundle analysis commands	2 lines	Rarely used, not needed every turn

Savings Summary

Metric	Before	After	Improvement
Lines	380	62	84% fewer
Tokens per turn	~2,660	~434	84% fewer
30-turn Sonnet cost	$0.048	$0.008	$0.040 saved/session
30-turn Opus 4.6 cost	$0.080	$0.013	$0.067 saved/session
Monthly Opus 4.6 cost (110 sessions)	$8.80	$1.43	$7.37 saved/month

.claudeignore: Stop Indexing Junk

What .claudeignore Does

When Claude Code searches your project (via Glob or Grep), it can discover and read files that add tokens to your context but provide no value. The .claudeignore file (placed at your project root) tells Claude Code to skip these paths entirely.

This works the same way as .gitignore — same glob pattern syntax.

Why It Matters

Without .claudeignore, a Glob search for **/*.js in a Node.js project might return thousands of results from node_modules/. Even if Claude does not read them all, the search results themselves consume tokens, and Claude may waste turns exploring irrelevant files.

Recommended .claudeignore

# Dependencies — thousands of files Claude never needs to read
node_modules/
vendor/
bower_components/
.pnpm-store/

# Build output — generated files, not source code
dist/
build/
out/
.next/
.nuxt/
.svelte-kit/
.vercel/
.netlify/
target/
bin/
obj/

# Lock files — huge, machine-generated, not useful for Claude
package-lock.json
yarn.lock
pnpm-lock.yaml
Gemfile.lock
poetry.lock
composer.lock
Cargo.lock
go.sum

# Generated / compiled assets
*.min.js
*.min.css
*.bundle.js
*.chunk.js
*.map
*.d.ts

# Test artifacts
coverage/
.nyc_output/
test-results/
playwright-report/
__snapshots__/

# Caches
.cache/
.parcel-cache/
.eslintcache
.tsbuildinfo
*.pyc
__pycache__/
.pytest_cache/

# Version control internals
.git/

# Environment and secrets
.env
.env.*
*.pem
*.key

# IDE files (usually not needed)
.idea/
.vscode/settings.json
*.swp
*.swo

# Large data files
*.sqlite
*.db
*.sql.gz
*.csv
*.parquet

# OS files
.DS_Store
Thumbs.db
desktop.ini

# Logs
*.log
logs/

Measuring the Impact

You can estimate how much .claudeignore saves by checking what Claude would otherwise find:

# Count how many files Claude would index without .claudeignore
find . -type f | wc -l

# Count how many are in node_modules alone
find ./node_modules -type f 2>/dev/null | wc -l

# A typical React project: 30,000+ files in node_modules vs ~200 source files

In a project with node_modules/ containing 30,000 files, a single Glob search returning even 100 results from dependencies adds ~500-2,000 tokens of useless context per search. Over a 30-turn session with multiple searches, this adds up to 5,000-20,000 wasted tokens.
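
A cross-platform alternative to the `find` commands is sketched below in Python. It is a simplification: it only groups files by top-level directory, whereas real ignore patterns such as `node_modules/` also match at deeper levels, and the set of directory names passed in is just an illustrative subset of the list above.

```python
import os
from collections import Counter


def count_files_by_top_dir(root: str) -> Counter:
    """Count files under each top-level directory of root ('.' for files at the root)."""
    counts: Counter = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] += len(filenames)
    return counts


def ignored_share(counts: Counter, ignored_dirs: set[str]) -> float:
    """Fraction of all files that live under the given top-level directories."""
    total = sum(counts.values())
    ignored = sum(n for name, n in counts.items() if name in ignored_dirs)
    return ignored / total if total else 0.0


if __name__ == "__main__":
    counts = count_files_by_top_dir(".")
    for name, n in counts.most_common(5):
        print(f"{n:7d}  {name}")
    share = ignored_share(counts, {"node_modules", "dist", "build", "coverage", ".git"})
    print(f"{share:.0%} of files sit in directories the ignore file would skip")
```

Running it from the project root shows at a glance whether dependencies and build output dominate the file count.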

File Read Strategies

The Problem with Full File Reads

When Claude uses the Read tool on a file, the entire file content becomes part of the conversation history. A 500-line file is approximately 5,000 tokens — and those tokens persist for every remaining turn.

Reading a 500-line file on turn 5 of a 30-turn session:
= 5,000 tokens x 25 remaining turns
= 125,000 extra input tokens
= $0.375 on Sonnet (uncached) or ~$0.075 (with 80% caching)

Reading three large files carelessly can add more cost than your entire CLAUDE.md.
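
The carry-cost calculation above generalizes to any file size and any point in the session. A minimal sketch, using the guide's $3/MTok Sonnet input price and its simplification that at an 80% cache rate only the 20% of tokens that miss the cache are billed:

```python
def carry_cost(file_tokens: int, read_on_turn: int, session_turns: int,
               price_per_mtok: float = 3.00) -> tuple[int, float]:
    """Extra input tokens and dollars from a file read that stays in history."""
    remaining_turns = session_turns - read_on_turn
    extra_tokens = file_tokens * remaining_turns
    return extra_tokens, extra_tokens * price_per_mtok / 1_000_000


tokens, uncached = carry_cost(file_tokens=5_000, read_on_turn=5, session_turns=30)
# The guide's cached estimate bills only the 20% of tokens that miss the cache.
cached_estimate = uncached * 0.20

print(f"{tokens:,} extra input tokens")
print(f"${uncached:.3f} uncached, about ${cached_estimate:.3f} at an 80% cache rate")
```

The same file read on turn 25 instead of turn 5 carries for only 5 turns, which is why late reads are much cheaper than early ones.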

Strategies to Minimize File Read Costs

1. Point Claude to specific locations

Instead of:

Look at the user service and fix the bug

Use:

Fix the null pointer in src/services/userService.ts — the getUserById function around line 45

Claude will read only the relevant section instead of the entire file.

2. Reference functions by name

Read the processPayment function in src/services/paymentService.ts

Claude can use Grep to find the function and read only the surrounding lines rather than the full file.

3. Use line ranges for large files

If you know where the relevant code is, tell Claude:

Read lines 120-180 of src/models/Order.ts — that's the calculateTotal method

4. Let Claude search instead of read

For finding patterns across files, Grep is much cheaper than reading entire files:

Search for all uses of deprecated_function across the src/ directory

Grep results are typically 50-200 tokens vs 2,000-10,000 for full file reads.

5. Avoid "read and understand" prompts for large files

Instead of:

Read src/config/routes.ts and tell me about the route structure

Use:

What routes are defined in src/config/routes.ts? Just list the paths and HTTP methods.

This signals Claude to scan efficiently rather than ingest the whole file into a detailed analysis.

File Read Cost Reference

File Size	Tokens	Per-Turn Cost (Sonnet)	30-Turn Carry Cost
50 lines	~500	$0.0015	~$0.009
100 lines	~1,000	$0.003	~$0.018
300 lines	~3,000	$0.009	~$0.054
500 lines	~5,000	$0.015	~$0.090
1,000 lines	~10,000	$0.030	~$0.180
2,000 lines	~20,000	$0.060	~$0.360

"30-turn carry cost" = the total extra input cost of having that file in history for the remaining 30 turns at blended cache rate. Actual cost is lower with higher cache rates but these are useful upper-bound estimates.

Using /compact to Reset Context

What /compact Does

The /compact command tells Claude Code to summarize the entire conversation history into a condensed form. This replaces the full history with a much shorter summary, dramatically reducing input tokens for all subsequent turns.

Before /compact:
  Conversation history: 80,000 tokens (40 turns of detailed exchanges)

After /compact:
  Conversation summary: ~5,000-10,000 tokens (key decisions and context preserved)

Savings: 70,000-75,000 tokens of input per turn going forward

When to Use /compact

Signal	Action
Session exceeds 20 turns	Run `/compact`
`/usage` shows input tokens > 60K per turn	Run `/compact`
Claude seems slow to respond	Context may be large — run `/compact`
You are switching to a different area of the codebase	Run `/compact` (or start a new session)
Claude is "forgetting" earlier instructions	Context may be truncating — `/compact` + restate key context

When NOT to Use /compact

Signal	Why Not
You are in the middle of a multi-step operation	Claude needs the detailed history to continue correctly
You just referenced specific code from earlier turns	The summary may not preserve exact code details
Session is under 10 turns	Not enough history to justify the cost of compacting

The /compact Trade-Off

/compact is not free:

It costs tokens to generate the summary — Claude produces output tokens for the summary (an output cost)
It breaks the prompt cache — the conversation structure changes, so cached content needs to be re-cached
It loses detail — the summary is lossy; specific code snippets and exact phrasings may be lost

The rule of thumb: /compact pays for itself after 3-5 turns following the compaction. If you have fewer turns left, it may not be worth it. If you have 10+ turns left, it is almost always worth it.
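
The break-even point can be estimated with a rough model. Everything here is an assumption for illustration: the $15/MTok output price is not stated in this guide, the one-time cost is modeled as generating the summary plus one full-price read of it, and the saving per turn is the removed history priced at the blended cache rate. Real sessions will differ.

```python
import math
from typing import Optional


def compact_break_even_turns(history_tokens: int, summary_tokens: int,
                             input_price: float = 3.00,    # USD/MTok, from this guide
                             output_price: float = 15.00,  # USD/MTok, assumed
                             cache_hit_rate: float = 0.80,
                             cache_read_multiplier: float = 0.10) -> Optional[int]:
    """Turns after compaction needed for the per-turn saving to cover the one-time cost."""
    blended = input_price * ((1 - cache_hit_rate) + cache_hit_rate * cache_read_multiplier)
    # One-time cost: write the summary (output) and read it once uncached (input).
    one_time = summary_tokens * (output_price + input_price) / 1_000_000
    saved_per_turn = (history_tokens - summary_tokens) * blended / 1_000_000
    if saved_per_turn <= 0:
        return None  # compaction never pays off if nothing is removed
    return math.ceil(one_time / saved_per_turn)


print(compact_break_even_turns(80_000, 7_500))                       # 80% cache hits
print(compact_break_even_turns(80_000, 7_500, cache_hit_rate=0.95))  # 95% cache hits
```

Note how a higher cache hit rate pushes the break-even later: the better your history is already cached, the less each turn saves by removing it.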

Custom Compact Prompts

You can provide a focus hint when compacting:

/compact Focus on the authentication refactor — keep all decisions about JWT token structure and middleware changes.

This helps Claude prioritize what to preserve in the summary, reducing the risk of losing important context.

Subagents: Isolating Context-Heavy Work

How Subagents Save Tokens

When Claude Code spawns a subagent (via the Task tool), that subagent gets its own isolated context window. The key cost benefit: the subagent's detailed work does not pollute your main conversation history.

Without subagents:
  Main context: system + CLAUDE.md + history + [5 large file reads] + [50 grep results]
  Every subsequent turn carries those file reads and grep results

With subagents:
  Subagent context: system + CLAUDE.md + [5 large file reads] + [50 grep results]
  Main context: system + CLAUDE.md + history + [subagent's summary result]
  The detailed file contents stay in the subagent, not your main thread

What Comes Back to Main Context

When a subagent completes, only its final result is added to your main conversation. Not the files it read, not the searches it ran — just the answer. This is typically 100-500 tokens vs the 10,000-50,000 tokens the subagent consumed internally.

Best Use Cases for Subagents

Task	Why Subagent	Token Savings
Searching codebase for patterns	Grep results stay in subagent	2,000-10,000
Reading multiple files for analysis	File contents stay in subagent	5,000-30,000
Generating boilerplate	Output stays in subagent until summarized	1,000-5,000
Running and interpreting test output	Verbose test output stays in subagent	3,000-15,000
Exploring unfamiliar code	Discovery reads stay in subagent	5,000-20,000

How to Trigger Subagent Usage

Claude Code automatically uses subagents for certain complex tasks, but you can encourage it:

Search the entire src/ directory for all usages of the deprecated
calculateTotal function, and give me a summary of which files need updating.
Don't read the full files — just tell me file names and line numbers.

By asking for a summary, you signal Claude to delegate the heavy search to a subagent and return only the distilled result.

When Subagents Are Not Worth It

Very short tasks — The overhead of spawning a subagent (separate context initialization) may cost more than just doing it inline for simple lookups
Tasks requiring tight interaction — If the subagent's result determines your next 5 prompts, the back-and-forth negates the isolation benefit
Already-small context — If your main context is under 20K tokens, isolation savings are minimal
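
The trade-off between isolation savings and spawn overhead can be sketched as a net-savings estimate. This is a simplified model with assumed inputs: it counts only the tokens carried in the main thread afterwards, treats the subagent's startup (system prompt plus CLAUDE.md) as a one-time cost, and borrows the 4,500-token figure from the turn-1 example earlier in this guide.

```python
def subagent_net_savings(work_tokens: int, summary_tokens: int,
                         remaining_turns: int, startup_tokens: int = 4_500) -> int:
    """Main-thread input tokens saved by delegating, net of subagent startup."""
    inline_carry = work_tokens * remaining_turns        # raw results stay in main history
    delegated_carry = summary_tokens * remaining_turns  # only the summary is carried
    return inline_carry - delegated_carry - startup_tokens


# Heavy search early in a session: large positive saving.
heavy = subagent_net_savings(work_tokens=20_000, summary_tokens=300, remaining_turns=15)

# Small lookup near the end of a session: the startup cost dominates.
small = subagent_net_savings(work_tokens=400, summary_tokens=200, remaining_turns=3)

print(f"heavy search: {heavy:+,} tokens")
print(f"small lookup: {small:+,} tokens")
```

A negative result corresponds to the "very short tasks" case in the list above, where doing the work inline is cheaper.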

Writing Concise Prompts

Why Prompt Length Matters

Your prompts are typically 20-200 tokens, which seems small compared to system prompts and history. But concise prompts have a second-order benefit: they produce shorter conversations, which means less history accumulation.

A vague prompt leads to clarification questions, false starts, and iteration — each adding to history. A precise prompt often gets the right result in one turn.

Bad vs Good Prompts (With Token Impact)

Example 1: Bug Fix

Bad (47 tokens):

Hey Claude, so I've been having this issue where the app crashes sometimes
when users try to log in. I think it might be related to the auth middleware
but I'm not sure. Can you take a look and see what might be going on?

Good (28 tokens):

Fix the null reference crash in src/middleware/auth.ts — req.user is
undefined when the JWT token is expired. Add a null check before line 23.

The bad prompt will trigger Claude to: search for auth files, read multiple files, ask clarifying questions, and guess at the issue — costing 3-5 turns. The good prompt leads to a 1-turn fix.

Estimated cost difference: $0.08 vs $0.02 (Sonnet) for the same outcome.

Example 2: New Feature

Bad (62 tokens):

I need a new endpoint for the admin panel. It should let admins search through
user accounts with various filters. We need to be able to filter by name,
email, role, and registration date. It should support pagination too.
Also make sure it has proper authentication. Oh and add tests.

Good (51 tokens):

Add GET /api/admin/users endpoint in src/server/routes/admin.ts:
- Query params: name, email, role, registered_after, registered_before, page, limit
- Require admin role (use existing adminAuth middleware)
- Return paginated { users, total, page, limit }
- Add tests in src/server/routes/__tests__/admin.test.ts

The good prompt is actually fewer tokens and more specific. Claude can implement it in 1-2 turns instead of 3-5 because there is no ambiguity.

Prompt Conciseness Rules

Name exact files and paths — "in src/services/auth.ts" not "in the auth service"
Reference line numbers when possible — "around line 45" saves a file search
Specify the expected pattern — "return { error, code } format" not "handle errors properly"
Batch related changes — one prompt for 5 related edits beats 5 separate prompts
Skip pleasantries — "Fix X" not "Hey, could you possibly help me fix X?"
Use structured formats — bullet points and specs are more token-efficient than prose

Memory Files vs Inline Instructions

The Problem with Repeating Yourself

If you find yourself typing the same instruction across multiple sessions:

Remember to always use single quotes in TypeScript files

Make sure to add error handling with our standard AppError class

Run npm test after making changes

Each repetition costs ~15-30 tokens of input. Across 5 sessions a day, that is 75-150 wasted tokens per instruction per day. Not huge individually, but these add up when you have 10+ such habits.

The Solution: Put It in CLAUDE.md Once

## Code Rules
- Single quotes in TypeScript
- Error handling: throw new AppError(message, statusCode)
- Run `npm test` after changes

Three lines in CLAUDE.md (~21 tokens), served from cache on every turn, replace 30+ tokens of ad-hoc instructions that you would otherwise retype in every session.

When to Use CLAUDE.md vs Separate Files

Instruction Type	Where	Why
Universal rules (applies to every task)	CLAUDE.md	Loaded every turn, always available
Module-specific patterns	Nested CLAUDE.md in that directory	Only loaded when working in that area
Rare/specialized workflows	Custom slash command	Only loaded when explicitly invoked
One-off task context	Your prompt	Does not persist beyond this session

Nested CLAUDE.md Files

Claude Code supports CLAUDE.md files in subdirectories. These are loaded only when Claude is working with files in that directory or below:

project/
├── CLAUDE.md                 ← Always loaded (keep this lean)
├── src/
│   ├── client/
│   │   └── CLAUDE.md         ← Loaded only for frontend work
│   └── server/
│       └── CLAUDE.md         ← Loaded only for backend work
└── infrastructure/
    └── CLAUDE.md             ← Loaded only for infra work

This lets you move domain-specific instructions out of the root CLAUDE.md (reducing its size) while still having them available when relevant.

Important: All loaded instruction files share the 12,000-character total budget. If your root CLAUDE.md is 3,000 characters and your src/client/CLAUDE.md is 2,500 characters, that is 5,500 characters of budget consumed when working in src/client/. Plan your hierarchy so that the files loaded for any given working directory stay well under 12K combined.
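
To see which instruction files would be in play for a given working directory, the discovery walk described earlier can be approximated as follows. This is a sketch of the behavior as this guide describes it, not Claude Code's actual implementation, and it starts from a `root` you pass in rather than the filesystem root.

```python
from pathlib import Path

# Candidate file names checked at each directory level, per this guide.
CANDIDATES = ("CLAUDE.md", "CLAUDE.local.md", ".claude/CLAUDE.md", ".claude/instructions.md")


def discover_instruction_files(root: Path, cwd: Path) -> list[Path]:
    """List instruction files found at each level from root down to cwd."""
    root, cwd = root.resolve(), cwd.resolve()
    relative = cwd.relative_to(root)  # raises ValueError if cwd is outside root
    levels = [root]
    for part in relative.parts:
        levels.append(levels[-1] / part)
    return [level / name for level in levels for name in CANDIDATES
            if (level / name).is_file()]


def total_chars(files: list[Path]) -> int:
    """Combined character count of the given files."""
    return sum(len(f.read_text(encoding="utf-8")) for f in files)


if __name__ == "__main__":
    found = discover_instruction_files(Path("."), Path("."))
    for f in found:
        print(f)
    print(f"{total_chars(found):,} characters of instruction content")
```

Calling it with `cwd` set to `src/client` versus `src/server` shows how the loaded set, and therefore the budget consumed, changes with where you are working.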

Example: moving frontend rules to src/client/CLAUDE.md

Root CLAUDE.md (before): 150 lines including 40 lines of React-specific rules
Root CLAUDE.md (after): 110 lines
src/client/CLAUDE.md: 40 lines of React rules -- only loaded during frontend work

Savings: 40 fewer lines in root CLAUDE.md = ~280 fewer tokens on every non-frontend turn.

Custom Slash Commands for Specialized Workflows

For workflows you run occasionally (deploying, database migrations, performance audits), create custom commands instead of putting instructions in CLAUDE.md:

.claude/commands/deploy.md:
---
Run the deployment checklist:
1. Run npm test and ensure all pass
2. Run npm run build and check for errors
3. Check that .env.production has all required variables
4. Run npm run db:migrate -- --dry-run to preview migrations
5. Report status of each step
---

Invoke with /deploy when needed. This keeps 20+ lines out of CLAUDE.md and only loads them when you explicitly ask.

Putting It All Together

The Context Optimization Checklist

Apply these in order of impact:

Expected Savings When Fully Applied

Strategy	Savings on Input	Effort to Implement
CLAUDE.md under 150 lines	10-20%	One-time, 15 minutes
.claudeignore configured	5-15%	One-time, 2 minutes
/compact usage	10-20% per long session	Ongoing habit
Precise file references	5-15%	Ongoing habit
Subagent delegation	10-25% on search-heavy sessions	Ongoing habit
Concise prompts	5-10%	Ongoing habit
Nested CLAUDE.md + commands	5-10%	One-time, 15 minutes
New sessions for new tasks	10-20%	Ongoing habit
Combined	30-50% reduction

These percentages overlap rather than add up, which is why the combined figure is 30-50%. A developer who was spending $15/day on Sonnet can realistically drop to $7-10/day by applying all of these strategies, saving $110-176/month over 22 working days.

Next: Guide 03 - Model Selection — when to use Opus vs Sonnet vs Haiku, with a decision tree and cost comparisons for every task type.
