Claude Code Keeps Running Out of Context — Here's How I Fixed My Workflow
TL;DR
Claude Code's context window is finite, and most developers burn through it way too fast by reading entire files, chatting casually, and tackling monolithic tasks. Break work into focused sessions, use /compact aggressively, write a solid CLAUDE.md, stop saying 'thank you' mid-session, and use subagents to offload heavy exploration. Your context is a budget — spend it wisely.
I was three hours into a refactor — Claude Code and I were on a roll, ripping through components, updating types, wiring up new API routes — when I saw the message that makes every developer's stomach drop:
"Context window limit reached."
Three hours of shared understanding, gone. Claude forgot everything. The architecture decisions, the file structure we'd mapped out, the half-finished migration — all of it evaporated like my mass spectrometry samples used to evaporate when I left them out in the lab during my PhD. (Except those were accidents. This was preventable.)
I sat there staring at my terminal, contemplating whether to start over or just go outside. I chose neither and instead spent the next two weeks figuring out how to never end up in that situation again.
This post is everything I learned.
What Is the Context Window, Really?
Think of Claude Code's context window as a whiteboard. Everything — your messages, Claude's responses, file contents it reads, search results, tool outputs, error logs — gets written on this whiteboard. Once it's full, nothing else fits. Game over.
┌─────────────────────────────────────────────────────────┐
│ CONTEXT WINDOW │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Your │ │ Claude's │ │ File │ │
│ │ Messages │ │ Responses│ │ Reads │ │
│ │ │ │ │ │ │ │
│ │ ~15% │ │ ~30% │ │ ~25% │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Search │ │ Tool │ │ Error │ │
│ │ Results │ │ Outputs │ │ Logs │ │
│ │ │ │ │ │ │ │
│ │ ~10% │ │ ~10% │ │ ~10% │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ ████████████████████████████████░░░░░░░░ 85% FULL │
│ │
└─────────────────────────────────────────────────────────┘
The rough numbers vary, but you get the idea. That whiteboard fills up fast. And here's the part that surprised me: Claude's own responses are often the biggest context hog. When it writes a 200-line file, explains its reasoning, and then writes tests — that's a lot of whiteboard space consumed by output alone.
The Day I Lost Three Hours of Work
Let me tell you the full story because I think it's instructive (and because misery loves company).
I was building a new feature for a Next.js app — a multi-step form wizard with validation, API integration, and state management. Classic full-stack task. I started by asking Claude to read the existing codebase structure. It read about 15 files. Then I asked it to understand the current form patterns. More files. Then the API routes. More files.
We hadn't written a single line of new code and I'd already burned through probably 30% of my context window just on reconnaissance.
Then we started building. Claude was generating components, I was reviewing, asking for changes, it was reading test files to understand patterns... and before I knew it — boom. Context limit.
The Expensive Lesson
Exploration is the silent context killer. Every file read, every grep search, every "can you also check..." eats into your budget. If you don't have a plan before you start, you'll spend half your context just figuring out what to do.
The fix? I completely changed how I work with Claude Code. Here's the playbook.
Strategy 1: Break Everything Into Small, Focused Sessions
This is the single most impactful change I made. Instead of "build me this entire feature," I now think in atomic sessions:
❌ Old approach (one mega-session):
"Build a multi-step form with validation, API routes,
tests, and error handling"
→ Context explodes at step 3 of 7
✅ New approach (multiple focused sessions):
Session 1: "Create the form wizard component with step navigation"
Session 2: "Add validation schemas for each form step"
Session 3: "Build the API route for form submission"
Session 4: "Write tests for the form wizard"
Session 5: "Add error handling and loading states"
→ Each session completes fully within context
Each session has one clear goal. When it's done, I start fresh. The key insight is that starting a new conversation is free — there's no penalty for it. But running out of context mid-task? That's expensive.
The 15-Minute Rule
If I think a task will take Claude more than 15 minutes of back-and-forth, I break it up. This isn't a hard science — it's a gut check. But it's saved me countless blown sessions.
Strategy 2: Use /compact Like Your Life Depends On It
The /compact command is criminally underused. It summarizes your conversation history, compressing thousands of tokens into a tight summary. I use it:
- After finishing a logical subtask
- Before pivoting to a different part of the codebase
- When I notice the conversation getting long
- Basically every 10-15 minutes of active work
You: "Add the user authentication middleware"
Claude: *writes 80 lines of middleware code*
You: "Looks good. Now let's move to the dashboard."
You: /compact ← DO THIS BEFORE SWITCHING CONTEXT
Claude: *compresses conversation*
You: "Now build the dashboard API route"
The summary preserves what matters — decisions made, files changed, current state — while ditching the verbose back-and-forth that was eating your context alive.
Strategy 3: Stop Being Polite (Seriously)
This one hurts because my mom raised me right. But here's the truth: every "thank you," "looks great!", and "awesome, now can you..." costs tokens. Both your message and Claude's polite response.
❌ Token-expensive conversation:
You: "Can you update the header component?"
Claude: *updates it*
You: "Thanks so much! That looks really great.
I love how you handled the responsive part.
One small thing though — could you also
add the dark mode toggle?"
Claude: "Thank you for the kind words! I'm glad you
liked the responsive approach. I'd be happy
to add the dark mode toggle..."
Token cost: ~400 tokens on pleasantries alone
✅ Token-efficient conversation:
You: "Update the header component"
Claude: *updates it*
You: "Add dark mode toggle to the header"
Token cost: ~40 tokens
I'm not saying be rude. I'm saying save the gratitude for after the session. Claude doesn't have feelings (yet), and your context window doesn't care about your manners. Be direct, be specific, move fast.
Real Numbers
I tracked this for a week. In sessions where I was conversational, I averaged 45 minutes before hitting context limits. In sessions where I was direct and terse, I averaged 80+ minutes. Almost double the productive time from just cutting the small talk.
Strategy 4: Write a CLAUDE.md That Actually Works
Your CLAUDE.md file (or .claude/settings.json project instructions) is the single best investment you can make. It gives Claude instant context about your project without having to read dozens of files to figure things out.
Here's what I include in mine:
# Project: MyApp
## Architecture
- Next.js 15 App Router, TypeScript, Tailwind CSS
- Database: PostgreSQL with Drizzle ORM
- Auth: NextAuth.js with Google/GitHub providers
- State: Zustand for client, React Query for server
## Key Directories
- src/app/ — App Router pages and API routes
- src/components/ — Shared UI components
- src/lib/ — Utilities, database, auth config
- src/types/ — TypeScript type definitions
## Conventions
- Use server components by default, 'use client' only when needed
- API routes return { data, error } shape
- All forms use react-hook-form + zod validation
- Tests go in __tests__/ adjacent to source files
- Use cn() utility for conditional Tailwind classes
## Common Patterns
- Database queries go through src/lib/db/queries/
- Auth checks use getServerSession() in server components
- Error boundaries wrap each route segmentThis single file saves me thousands of tokens per session because Claude doesn't need to read half my codebase to understand the project structure and conventions. It just knows.
CLAUDE.md Pro Move
Include a "Current Work" section at the top of your CLAUDE.md that you update with what you're actively working on. When you start a new session, Claude immediately has context on what you were doing before — no need to re-explain.
Strategy 5: Structure Your Prompts for Efficiency
The way you write prompts has a massive impact on context usage. Here's my formula:
❌ Vague prompt (causes exploration, burns context):
"The login isn't working right, can you fix it?"
✅ Precise prompt (surgical, context-efficient):
"In src/app/auth/login/page.tsx, the form submission
handler (line ~45) isn't passing the redirect URL
to signIn(). Add the callbackUrl parameter from
the search params."
The precise prompt tells Claude exactly where to look and what to do. No exploration needed. No reading five files to understand the problem. Direct hit.
My prompt template for most tasks:
1. What file(s) to modify (exact paths)
2. What the current behavior is
3. What the desired behavior is
4. Any constraints or patterns to follow
This might feel over-specified, but trust me — the upfront effort of writing a clear prompt saves 10x the context that would be wasted on exploration and clarification.
Strategy 6: Use Subagents for Heavy Lifting
Claude Code can spawn subagents — essentially child processes that do work in their own context window and return just the results. This is a game-changer for context management.
When I need to explore a large codebase or do research-heavy tasks, I explicitly ask Claude to use a subagent:
"Use a subagent to find all files that import the
UserProfile component and list the file paths and
how they use it. Just give me the summary."
The subagent does all the file reading and searching in its own context. My main session only receives the compressed results. It's like hiring an intern to do the research and give you a one-page brief instead of dumping 50 pages of raw notes on your desk.
┌─────────────────────────────────────────┐
│ Main Session Context │
│ │
│ You ←──── Summary only ────┐ │
│ │ │
│ ┌─────────┴────────┐ │
│ │ Subagent │ │
│ │ │ │
│ │ Reads 20 files │ │
│ │ Runs searches │ │
│ │ Analyzes code │ │
│ │ │ │
│ │ (own context) │ │
│ └──────────────────┘ │
│ │
│ Main context: barely touched │
└─────────────────────────────────────────┘
Strategy 7: Know When to Start Fresh
This is the hardest lesson because of sunk cost fallacy. When your context is getting full, do not try to squeeze in "just one more thing." You'll get degraded responses, Claude will start forgetting earlier decisions, and the quality tanks.
My rules for starting a new conversation:
- The task is done — even if there's related work, start fresh
- You're switching domains — going from frontend to backend? New session
- Claude starts repeating itself — it's running low, bail out
- You've been going for 20+ minutes — consider a
/compactor a fresh start - The quality drops — if responses feel off, don't push it
Signs you should have started a new conversation 5 minutes ago:
☐ Claude suggests something you already discussed and rejected
☐ Responses are getting shorter and less detailed
☐ It's reading files it already read earlier
☐ You're re-explaining context you already gave
☐ The output quality just... feels worse
Starting fresh with a clear, specific prompt is almost always faster than limping along with an exhausted context window.
My Actual Workflow (A Real Example)
Let me walk you through how I recently built a notification system using these principles:
Session 1: Database Schema (8 minutes)
Prompt: "Create a notifications table in src/lib/db/schema.ts
following the existing pattern. Columns: id, userId, type
(enum: info/warning/error), title, message, read (boolean),
createdAt. Add the Drizzle migration."
Done. New session.
Session 2: API Routes (12 minutes)
Prompt: "Create CRUD API routes for notifications at
src/app/api/notifications/. Follow the existing API pattern
in src/app/api/users/. Include GET (with pagination), POST,
PATCH (mark as read), DELETE."
Done. New session.
Session 3: UI Components (15 minutes)
Prompt: "Build a NotificationBell component and NotificationPanel
dropdown. Follow the component patterns in src/components/ui/.
Use the notifications API from Session 2. Include unread count
badge and mark-as-read on click."
Used /compact once midway. Done. New session.
Session 4: Tests (10 minutes)
Prompt: "Write tests for the notification API routes and the
NotificationBell component. Follow test patterns in
__tests__/api/ and __tests__/components/."
Done.
Total time: ~45 minutes across 4 sessions. Each session completed fully. No context blowups. No lost work. No frustration.
If I'd tried this in one session? I'd have hit the wall somewhere in Session 3 and lost everything from Sessions 1-2 that Claude was holding in context.
The Token Budget Mindset
I now think about context like a budget. Every session, I have a fixed amount to spend:
| Action | Token Cost | Worth It? |
|---|---|---|
| Clear, specific prompt | ~100 tokens | Always |
| Reading a file | ~500-2000 tokens | Only if needed |
| "Thanks, looks great!" | ~50 tokens | Never during session |
| Vague prompt requiring exploration | ~3000+ tokens | Rarely |
| /compact | Negative (saves tokens) | Almost always |
| Re-explaining what you want | ~200+ tokens | Avoid with better prompts |
| Asking Claude to explain its code | ~500+ tokens | Only if confused |
The developers who get the most out of Claude Code are the ones who treat every token like it costs money. Because in a way, it does — it costs you time and productivity when the context runs out.
The Biggest Waste I See
Developers who paste entire error logs, full stack traces, or complete file contents when only a few relevant lines would do. If you have a 200-line error log, find the relevant error and paste those 5 lines. Your context window will thank you.
Quick Reference Cheat Sheet
┌─────────────────────────────────────────────────────────┐
│ CONTEXT SURVIVAL CHEAT SHEET │
├─────────────────────────────────────────────────────────┤
│ │
│ BEFORE the session: │
│ ✓ Write/update CLAUDE.md │
│ ✓ Plan your task breakdown │
│ ✓ Identify the specific files you'll touch │
│ ✓ Write your first prompt before starting │
│ │
│ DURING the session: │
│ ✓ Be direct — skip pleasantries │
│ ✓ Give exact file paths in prompts │
│ ✓ Use /compact every 10-15 minutes │
│ ✓ One task per session │
│ ✓ Don't paste full files/logs — excerpt them │
│ │
│ WHEN TO BAIL: │
│ ✓ Task is complete (even partially) │
│ ✓ Switching to a different area of code │
│ ✓ Quality of responses is declining │
│ ✓ Claude is repeating itself │
│ │
│ AFTER the session: │
│ ✓ Update CLAUDE.md "Current Work" section │
│ ✓ Note what's done and what's next │
│ ✓ Start a fresh session for the next task │
│ ✓ NOW you can say thank you :) │
│ │
└─────────────────────────────────────────────────────────┘
It Gets Better With Practice
When I first started managing context deliberately, it felt annoying. Like, I'm already learning a new tool and now I have to think about token budgets? But after a week or two it became second nature. Just like you don't think about saving a file anymore (Cmd+S is muscle memory), context management becomes instinctive.
And the payoff is real. I went from losing work to context limits multiple times a day to maybe hitting the limit once a week — and even then, it's at the natural end of a task, not mid-sentence in a critical refactor.
The irony isn't lost on me: I'm a PhD in Biomedical Engineering writing about how to efficiently communicate with a language model. But honestly, the skills transfer more than you'd think. Research taught me to be methodical, to plan before executing, and to never trust that you'll have unlimited resources. Turns out those same principles apply to context windows.
Now if you'll excuse me, I have a CLAUDE.md to update.
Struggling with Claude Code context limits or AI-assisted development workflows? Let's chat — I've probably hit every wall there is and I'm happy to share the dents.
Frequently Asked Questions
Don't miss a post
Articles on AI, engineering, and lessons I learn building things. No spam, I promise.
Osvaldo Restrepo
Senior Full Stack AI & Software Engineer. Building production AI systems that solve real problems.