Claude Code Context Window: Optimize Your Token Usage
Master Claude Code's context window for larger projects. Token optimization, /compact, task chunking, and recovery techniques.
Agentic Orchestration Kit for Claude Code.
Problem: Claude Code loses track of your project when conversations get too long. You end up re-explaining everything from scratch, or worse, Claude produces code that contradicts decisions made twenty minutes ago.
Quick Win: Watch the token percentage in Claude Code's status bar. When you hit 80%, exit the session and restart with claude for complex multi-file work:
This single habit prevents the performance degradation that kills productivity on large projects. The rest of this guide covers why it works and five more strategies that compound on top of it.
Before the strategies, watch what you're actually managing. Press play to run a simulated session: what loads before you type, what each file read costs, and where the reserved auto-compact buffer sits.
The Mental Model: Active Working Memory
Claude Code's context window isn't passive storage. It's active working memory. As it fills up, Claude struggles to maintain awareness across your entire project, the same way you'd struggle to hold a 50-page document in your head while writing code.
The practical consequence: context depletion causes four predictable failures.
- Inconsistent code that conflicts with earlier work in the same session
- Repeated questions about project structure you already explained
- Lost architectural decisions and naming conventions
- Breaking changes that ignore patterns established minutes ago
Understanding when these failures start matters more than trying to prevent them entirely.
The 80/20 Rule
Never use the final 20% of your context window for complex, multi-file tasks. Memory-intensive operations like refactoring, feature implementation, and debugging require substantial working memory to track relationships between components.
High Context Tasks (stop at 80% capacity):
- Large-scale refactoring across multiple files
- Feature implementation spanning several components
- Complex debugging requiring architectural understanding
- Code reviews with cross-file dependencies
Low Context Tasks (safe to continue past 80%):
- Single-file edits with clear scope
- Independent utility function creation
- Documentation updates
- Simple, localized bug fixes
The insight from experienced users: isolate your complex work to the first 80% of a session. Save the tail end for lightweight tasks that don't need Claude to hold your entire architecture in mind.
/compact and When to Use It
The /compact slash command summarizes and compresses your conversation history. On the first-party Anthropic API, this runs fast because Claude Code's Session Memory writes a running summary to disk in the background, so /compact loads that pre-written summary instead of re-analyzing the whole conversation from scratch. This is observed behavior rather than something Anthropic's official context window documentation describes, and it is feature-flag gated: Bedrock, Vertex, and Foundry users don't get it.
That background summary captures your session title and status, key results and decisions, open questions, and a running work log. It lives at ~/.claude/projects/<project-hash>/<session-id>/session-memory/summary.md, one summary file per session. Our Session Memory guide covers the extraction cadence and availability in full.
When to compact manually:
- After completing a major feature (before starting the next)
- Before switching from research mode to implementation mode
- When Claude starts repeating questions or contradicting earlier decisions
- At natural breakpoints between unrelated tasks
When NOT to compact:
- Mid-debugging, when specific error messages and stack traces matter
- During complex refactoring where file-level details are critical
- Right before integration work that depends on context from component building
When you do compact, you can steer what the summary keeps. Run /compact focus on the auth refactor, or any instruction, and the summary preserves what you name instead of what the automatic pass guesses is important. Anthropic documents this behavior in the official context window reference. That one habit is the difference between losing the thread of your current task and carrying it cleanly into the next stretch of work.
The goal: provide minimum context necessary for effective task execution. This maximizes performance, token efficiency, and cost simultaneously. For a complete reference of /compact, /clear, and every other slash command in the interactive interface, see the interactive mode guide.
For a deep dive into the compaction buffer mechanics, how the 33K token reservation works, and how to override the trigger threshold, see the context buffer management guide.
What Survives Compaction
Whether you run /compact or it fires automatically, compaction summarizes your conversation history to fit the window. What happens to your instructions depends entirely on how they were loaded, and Anthropic documented the exact rules. They are not intuitive.
The short version: anything loaded from disk at startup comes back, and anything that arrived through the conversation gets summarized away. Your project-root CLAUDE.md, your unscoped rules, and your auto memory (Claude's own notes, the first 200 lines or 25KB of MEMORY.md) are all re-injected from disk after compaction. A rule scoped with paths: frontmatter, or a nested CLAUDE.md in a subdirectory, is a different story: it entered context when Claude read a matching file, so compaction summarizes it away, and it stays gone until Claude reads a matching file again.
The practical takeaway: if an instruction must survive a mid-task compaction, put it in your project-root CLAUDE.md, not in a path-scoped rule. Invoked skills come back too, but truncated to fit a per-skill cap, so keep the load-bearing parts near the top of each SKILL.md.
For the full mechanism-by-mechanism table, the exact skill token caps, the one startup element that does not reload, and how disable-model-invocation keeps side-effect skills out of context entirely, see what survives compaction in Claude Code.
CLAUDE.md as Persistent Memory
Your CLAUDE.md file persists across sessions automatically. Use it for context that should never be lost:
Claude reads this file at session start, giving you free context that survives restarts. The key: keep it lean. Every token in CLAUDE.md is a token you can't use for conversation. Document only what Claude needs on every session, not everything about your project.
When establishing new patterns or making architectural decisions mid-session, explicitly preserve them:
This turns ephemeral session knowledge into persistent project knowledge. See our memory optimization guide for advanced strategies on keeping your persistent context efficient.
Strategic Task Chunking
Instead of pushing Claude to exhaustion, divide work into context-sized chunks with natural breakpoints.
Complete components before integration:
Finish research phases before implementation:
Between sessions, create handoff notes that capture everything the next session needs:
Store these notes in your project. When starting fresh, point Claude to them for continuity. The handoff technique is especially useful when you know you're about to hit context limits and want to preserve more detail than auto-compaction's lossy summary provides.
Delegate Heavy Reads to Subagents
The cheapest context is the context you never load. When a task needs Claude to read a pile of files it won't reference again, like researching how a subsystem works before you change it, hand that work to a subagent. A subagent runs in its own separate context window. It loads its own copy of your CLAUDE.md and the same skills and MCP servers, does the reading over there, and returns only its final summary to your main session.
Anthropic's own walkthrough makes the savings concrete: in their example, a subagent read 6,100 tokens of files and returned a 420-token result. The roughly 5,700 tokens of file content that would have crowded your main window never touched it. One caveat worth knowing: a subagent does not inherit your conversation history or your main session's auto memory, so give it enough context in the task prompt to work on its own.
This is the highest-leverage move for long sessions, and it's the pattern ClaudeFast's Code Kit is built around: a coordinating main thread that delegates exploration to specialist agents so its own context stays reserved for decisions and your conversation.
Context Recovery Techniques
When context loss happens mid-session, whether from compaction, a crash, or just too much drift, use this recovery sequence:
- Quick recovery: Reference your most recent checkpoint notes or session file
- Pattern review: Ask Claude to scan recent files and identify established patterns
- Architecture refresh: Provide a brief project overview focusing on current components
- Continuation strategy: Start with small, isolated tasks while context rebuilds
- Fresh start: Use
/clearwhen context becomes too corrupted to salvage. Sometimes starting clean with good handoff notes is faster than fighting degraded context
For long sessions (60+ minutes), proactive preservation beats reactive recovery. Run a quick context refresh every 30 minutes:
This prevents context drift where Claude gradually loses alignment with your project's specific requirements.
If you want to automate this entirely, ClaudeFast's Code Kit includes a ContextRecoveryHook that automatically backs up your session state before compaction occurs. It captures architectural decisions, established patterns, and current progress so nothing gets lost when the context window resets.
Monitoring Context Usage
Knowing when you're running low prevents surprises. Three approaches, from simple to advanced:
1. Watch the status bar. Claude Code shows token percentage at the bottom of the terminal. This is your primary signal. Build the habit of glancing at it before starting each new task.
2. Run /context for a breakdown. The /context command shows exactly where your tokens are going: system prompt, tools, memory files, skills, and conversation history. If your memory files are eating 15% of your window before you even start, that's a problem you can fix. Pair it with /memory, which lists the CLAUDE.md and rules files loaded this session and links to your auto memory folder, so you can catch a bloated memory file before it taxes you all session.
3. Use StatusLine for automated monitoring. StatusLine receives real-time context metrics and can trigger warnings at configurable thresholds. See the context buffer management guide for implementation details, including how to calculate your actual free space accounting for the compaction buffer.
One practical tip: if you're regularly hitting compaction before finishing your task, the answer usually isn't a bigger context window. It's smaller tasks. Break the work down further.
Constraints as Training
Working within token limits forces deliberate choices that make you a better Claude Code user:
- Explicit file selection: Include only relevant files, not entire codebases
- Clear task definition: Break objectives into concrete, actionable steps
- Priority-based organization: Structure prompts with critical details first
- Compact examples: Provide minimal but representative code samples
These skills transfer when context windows eventually expand. Developers who embrace constraints become better collaborators regardless of technical limits. If your plan supports it, the 1M context window gives you 5x the usable space with no pricing premium. It runs on Fable 5, Sonnet 5, Opus 4.6 and later, and Sonnet 4.6, and Sonnet 5 reaches 1M with no [1m] variant to select. Compaction works the same way at the larger limit, so the discipline from working within 200K still makes you more effective.
Next Actions
Immediate: Check your token percentage in the status bar. If above 80%, exit and restart for your next complex task.
This week: Practice chunking one large feature into component-then-integration phases.
Ongoing: Build your CLAUDE.md file with project context that persists across sessions. For the complete framework on engineering information flow to Claude, see context engineering.
Master context management and you handle projects 5x larger than developers who ignore these limits. If you want a pre-built system that applies these context min-maxing strategies automatically, with skills-first loading, sub-agent delegation for context conservation, and structured session recovery, ClaudeFast's Code Kit ships these patterns ready to use.
This is why top developers use "context resets" between planning and execution. Learn this and four other Claude Code best practices that compound over time.
Last updated on
