Context Engineering
Advanced, ~20 min
What you'll learn
- Explain context rot and budget a session by window fill
- Structure CLAUDE.md / AGENTS.md memory files that actually constrain behavior
- Use compaction, just-in-time retrieval, and note-taking to keep week-long projects coherent
It is Tuesday afternoon. You open a new session and ask the agent to extend the feature you built Monday. It generates something that contradicts an architecture decision you made last week, the one you spent 40 minutes deliberating over in Friday’s planning session.
The agent is not stupid. It never saw that decision. The conversation where you made it is gone, and the new session started with an empty context window.
That is context rot. And fixing it is not about writing better prompts — it is about engineering the information that reaches the model.
What context engineering actually is
Anthropic defines context engineering as curating the smallest possible set of high-signal tokens for each agent call. Not the most complete record of everything that ever happened. Not a wall of documentation preloaded at session start. The minimum set needed to do the next unit of work correctly.
This matters because every frontier model degrades as the window fills. Chroma’s research confirmed it across models: quality drops measurably as context grows, well before you hit the token ceiling. More tokens do not mean better answers — past a certain point they mean worse ones.
The practical corollary: a focused 30K-token session will often outperform a bloated 200K-token one on the same task.
The Vibe Coding Reference already covers the 60-70% effective-context rule and the /clear vs /compact decision table — bookmark that cheat-sheet and come back to it. This lesson explains why degradation happens and goes deeper on the four levers that prevent it.
The four levers
1. Compaction
When a session accumulates dead weight — failed approaches, long error traces, exploratory tangents — compaction distills the thread into a summary the model can reason from efficiently.
    /compact
Run it at natural boundaries: after a feature lands and is committed, before shifting to a different area of the codebase, when you notice the model starting to contradict itself. The goal is to compact after work is safe in git. Compaction is lossy by design, and the commits are your real record.
Claude Code’s auto-compact triggers near the window limit. Getting ahead of it manually keeps you in control of what gets preserved.
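To make the idea concrete, here is a minimal Python sketch of what compaction does conceptually: older turns are folded into one summary message and the most recent turns are kept verbatim. This is not how Claude Code implements /compact. The `Message` type, the `compact` function, and the toy `toy_summarize` callback are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str  # "user", "assistant", or "summary"
    text: str


def compact(
    messages: List[Message],
    keep_last: int,
    summarize: Callable[[List[Message]], str],
) -> List[Message]:
    """Replace all but the last `keep_last` messages with one summary message."""
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    if len(messages) <= keep_last:
        return list(messages)  # nothing old enough to fold away
    split = len(messages) - keep_last
    old, recent = messages[:split], messages[split:]
    return [Message("summary", summarize(old))] + recent


def toy_summarize(old: List[Message]) -> str:
    # Stand-in for a model-written summary: only records how much was folded.
    return f"Summary of {len(old)} earlier messages"


history = [Message("user", f"turn {i}") for i in range(10)]
compacted = compact(history, keep_last=3, summarize=toy_summarize)
print(len(compacted))     # 4: one summary plus the three most recent turns
print(compacted[0].text)  # Summary of 7 earlier messages
```

The sketch also shows why compaction is lossy: whatever the summary does not mention is gone from the window, which is the reason to commit first.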
2. Just-in-time retrieval
Preloading every file that might be relevant at session start is a tax you pay in tokens up front, on things the agent may never use.
Instead, reference files only when the agent actually needs them:
    @src/services/billing.ts -- add a refund method following the existing patterns in this file
The @-reference pulls the file into context precisely when the agent needs it, then it can be compacted or cleared on the next boundary. This is especially valuable in large repos: let the agent navigate to what it needs rather than flooding it with your entire src/ tree.
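As a rough illustration of the token tax that preloading imposes, the Python sketch below compares loading every file against loading only the one that is referenced. The four-characters-per-token ratio is a crude assumption (real tokenizers vary), and the file names and sizes in `repo` are made up for the example.

```python
from typing import Dict, Iterable

CHARS_PER_TOKEN = 4  # assumption: crude average, real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def context_cost(files: Dict[str, str], referenced: Iterable[str]) -> int:
    """Estimated tokens spent if only the `referenced` paths are loaded."""
    return sum(estimate_tokens(files[path]) for path in referenced)


# Hypothetical repo contents: path -> file text.
repo = {
    "src/services/billing.ts": "x" * 8_000,
    "src/services/email.ts": "x" * 12_000,
    "src/legacy/v1-compat.ts": "x" * 40_000,
}

preload_all = context_cost(repo, repo.keys())
just_in_time = context_cost(repo, ["src/services/billing.ts"])
print(preload_all, just_in_time)  # 15000 2000
```

In this toy repo the just-in-time reference costs about an eighth of the preload, and the gap widens as the tree grows.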
3. Structured note-taking
For work that spans multiple sessions, the agent’s in-memory state is useless the moment you close the terminal. What persists is what you wrote down.
Before you compact or end a session, tell the agent to record what it learned:
    Before we compact: summarize the three decisions we made today about
    the data model, and append them to CLAUDE.md under ## Session Notes.
This gives you a persistent, human-readable audit trail that the agent re-reads on the next open. Plan files work the same way: a spec the agent references at the start of each session beats re-explaining your architecture from scratch every morning. (More on spec-driven workflows in Lesson 7: Spec-Driven Development.)
Claude Code also maintains a per-project auto-memory (a MEMORY.md index plus topic files) that persists across sessions. Run /memory to inspect and edit it.
4. Isolation via subagents
Sometimes the best way to protect your main context window is to not use it at all for a subtask.
Delegate exploratory research, dependency audits, or large-file analysis to a subagent. Its dead-ends, verbose output, and half-formed conclusions stay in its own window — your main session only sees the distilled result. This is covered in depth in Lesson 5: Subagents and Orchestration.
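The isolation idea can be sketched in a few lines of Python. This is a toy model, not the subagent mechanism of any particular tool: `Context`, `delegate`, and `audit_worker` are hypothetical names, and a context window is modeled as a plain list of strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Context:
    """A context window modeled as a plain list of text entries."""
    entries: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.entries.append(text)


def delegate(
    main: Context,
    task: str,
    worker: Callable[[str, Context], str],
) -> Context:
    """Run `worker` in a fresh context; only its returned result reaches `main`."""
    sub = Context()
    result = worker(task, sub)
    main.add(f"[subagent] {task}: {result}")
    return sub  # returned only so the caller can inspect what was kept out


def audit_worker(task: str, ctx: Context) -> str:
    # Hypothetical worker: produces noisy intermediate output, returns one line.
    for pkg in ["left-pad", "lodash", "express"]:
        ctx.add(f"checked {pkg}: long changelog and advisory text ...")
    return "1 of 3 dependencies needs an upgrade"


main = Context(["user: build the refund endpoint"])
sub = delegate(main, "dependency audit", audit_worker)
print(len(main.entries), len(sub.entries))  # 2 3
```

The main context grows by one line no matter how much the worker wrote into its own window.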
Memory files that actually work
The four levers above assume the agent has a baseline to work from. That baseline is your memory file. Here is what separates one that constrains behavior from one that evaporates.
The good: constraints the model can act on
    # Project: payment-service

    ## Build & verify
    npm run build       # must succeed, zero errors
    npm run test:unit   # run after every schema change

    ## Conventions
    - All monetary values in cents (integer), never floats
    - API responses: { data, error, meta } envelope, no exceptions
    - New endpoints: add to openapi.yaml before implementation
    - TypeScript strict mode; no `any`

    ## Never touch
    - src/legacy/v1-compat.ts - frozen, customer dependency
    - db/migrations/ - append only, never edit existing migrations

    ## Verification step
    After any auth change: npm run test:auth && check /health endpoint
Fifteen lines. Every line is actionable. The model can check itself against each one.
The bad: aspirational prose
    # Guidelines

    This project values clean, readable, maintainable code that follows
    best practices and industry standards. We aim for high test coverage
    and prefer well-documented solutions. The team uses a collaborative
    approach and code quality is important to us. We follow the principle
    of least surprise and try to write code that will be easy for future
    developers to understand and modify...
The model reads this, finds nothing to constrain a specific decision, and falls back on its priors. Wall-of-prose memory files do not fail loudly; they just silently do nothing.
The test: For each line in your memory file, ask “could the agent make a concrete decision based only on this line?” If the answer is no, cut or rewrite it.
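A crude automated pre-filter for this test might look like the Python sketch below. It flags lines that use vague wording and name nothing checkable (no command, path, inline code, or hard rule word). The word lists and the `flag_vague_lines` helper are illustrative assumptions, not an established standard, and the check does not replace reading each line yourself.

```python
import re
from typing import List

# Assumption: these word lists are illustrative, not an established standard.
VAGUE = re.compile(
    r"\b(clean|readable|maintainable|best practices|industry standards|"
    r"important|high quality|well-documented)\b",
    re.IGNORECASE,
)
# Signals that a line names something checkable: inline code, an npm command,
# a path or file name, or a hard rule word.
CONCRETE = re.compile(
    r"(`[^`]+`|\bnpm\b|\S+/\S*|\S+\.\w{2,4}\b|\b(never|always|must|only)\b)",
    re.IGNORECASE,
)


def flag_vague_lines(memory_file: str) -> List[str]:
    """Return non-heading lines that use vague wording and name nothing concrete."""
    flagged = []
    for raw in memory_file.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if VAGUE.search(line) and not CONCRETE.search(line):
            flagged.append(line)
    return flagged


sample = """# Guidelines
- All monetary values in cents (integer), never floats
- db/migrations/ is append only
- We value clean, readable code that follows best practices
- Code quality is important to us
"""
for line in flag_vague_lines(sample):
    print("rewrite or cut:", line)
```

On the sample, only the last two lines are flagged; the two concrete rules pass.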
The memory file hierarchy
| Tool | Memory file | Scope | How to create/edit |
|---|---|---|---|
| Claude Code | ~/.claude/CLAUDE.md | User-level — all projects | /memory |
| Claude Code | ./CLAUDE.md | Project — committed, team-shared | /init or manually |
| Codex CLI | ./AGENTS.md | Project | Manually |
| Antigravity CLI | ./AGENTS.md | Project (takes precedence over GEMINI.md) | Manually |
| GitHub Copilot | .github/copilot-instructions.md | Repo-wide | Manually |
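AGENTS.md is the cross-tool standard. If you work across Claude Code, Codex CLI, and Antigravity, a single AGENTS.md at the repo root travels with the codebase regardless of which tool your teammates use. Claude Code’s documented memory file is CLAUDE.md, so if it does not pick up AGENTS.md on its own in your setup, import the shared file from ./CLAUDE.md with a line such as @AGENTS.md. Layer tool-specific overrides (like Claude’s ./CLAUDE.md) on top for anything that only applies to one tool.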
~/.claude/CLAUDE.md holds preferences that apply everywhere — your preferred commit style, general conventions, how you like errors explained. The project ./CLAUDE.md holds project-specific constraints — build commands, frozen files, naming rules. Keep them distinct. Mixing project constraints into your user file means they apply everywhere; mixing personal preferences into the project file means they get committed and applied to everyone on the team.
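To see which of these files a given checkout actually has, a small Python helper along these lines can list them. The paths come from the table in this section; `find_memory_files` and the scope labels are hypothetical conveniences, and the sketch says nothing about the order in which any tool loads or merges the files.

```python
from pathlib import Path
from typing import List, Tuple

# Paths taken from the memory file table; scope labels are shortened.
PROJECT_FILES = [
    ("CLAUDE.md", "Claude Code, project"),
    ("AGENTS.md", "Codex CLI / Antigravity CLI, project"),
    (".github/copilot-instructions.md", "GitHub Copilot, repo-wide"),
]
USER_FILE = (".claude/CLAUDE.md", "Claude Code, user-level")


def find_memory_files(repo_root: Path, home: Path) -> List[Tuple[Path, str]]:
    """Return (path, scope) for each memory file from the table that exists."""
    candidates = [(home / USER_FILE[0], USER_FILE[1])]
    candidates += [(repo_root / rel, scope) for rel, scope in PROJECT_FILES]
    return [(path, scope) for path, scope in candidates if path.is_file()]


if __name__ == "__main__":
    # Lists whatever exists for the current directory and the current user.
    for path, scope in find_memory_files(Path.cwd(), Path.home()):
        print(f"{scope:40} {path}")
```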
Budgeting a session
Use /context to get a current breakdown of window usage — how many tokens are consumed, what is contributing, and how much headroom remains.
A practical session budget:
- Start clean. /clear between unrelated tasks. Yesterday’s e-commerce session is dead weight when you open a new billing task today.
- Watch the fill %. When you cross ~50-60%, you are approaching the zone where quality starts to degrade. Do not wait for auto-compact.
- Compact at natural boundaries. Finish the feature, commit the work, then /compact. The summary that compaction produces is higher-signal than the raw conversation it replaced.
- Clear when you switch domains. Starting a completely different area of the codebase? /clear and let the agent re-read what it needs via @-references and memory files.
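The budget rules above reduce to a small decision function. The Python sketch below is one way to write them down: the 0.5 threshold is an assumption taken from the lower end of the ~50-60% range, the 200,000-token window in the usage lines is only an example size, and `fill_fraction` and `next_step` are hypothetical names.

```python
def fill_fraction(used_tokens: int, window_tokens: int) -> float:
    """Share of the context window currently in use, from 0.0 to 1.0."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def next_step(
    used_tokens: int,
    window_tokens: int,
    work_committed: bool,
    switching_task: bool,
    watch_at: float = 0.5,  # assumption: lower end of the ~50-60% zone
) -> str:
    """Suggest an action following the session budget rules."""
    if switching_task:
        return "/clear"
    if fill_fraction(used_tokens, window_tokens) < watch_at:
        return "keep going"
    if work_committed:
        return "/compact"
    return "commit, then /compact"


print(next_step(40_000, 200_000, work_committed=False, switching_task=False))
print(next_step(120_000, 200_000, work_committed=True, switching_task=False))
print(next_step(120_000, 200_000, work_committed=False, switching_task=False))
print(next_step(30_000, 200_000, work_committed=True, switching_task=True))
```

The four calls print `keep going`, `/compact`, `commit, then /compact`, and `/clear`, in that order.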
The Vibe Coding Reference decision table has the full clear-vs-compact-vs-keep-going matrix — use it as your quick reference.
Claude Code’s /btw command lets you ask a throwaway question that sees current context but does not grow the window. Good for quick clarifications mid-task without polluting the thread.
Notes and artifacts as durable memory
The mental shift: stop thinking of the conversation as your project memory. The conversation is ephemeral. These are durable:
- CLAUDE.md / AGENTS.md: behavioral constraints, always present
- Plan files (~/.claude/plans/): the spec the agent re-reads each session
- Session notes: decisions appended to memory files before compaction
- Git history: the permanent record of what was built and why (commit messages matter)
A practical end-of-session prompt before you compact:
    We're about to compact. Before we do:
    1. What three decisions did we make today that the next session needs to know?
    2. Are there any constraints we discovered that should be added to CLAUDE.md?
    Write both to CLAUDE.md under ## Session Notes with today's date, then summarize so I can verify.
This 30-second habit is the difference between a week-long project that stays coherent and one that re-litigates the same architecture decisions every other day.
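If you would rather record the notes yourself than ask the agent to, the same habit can be scripted. The Python sketch below appends dated decisions under a `## Session Notes` heading. It assumes that heading is the last section of the file, and `append_session_notes`, the sample decision, and the example date are all hypothetical.

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import List

HEADING = "## Session Notes"


def append_session_notes(memory_path: Path, decisions: List[str], day: date) -> None:
    """Append dated decisions at the end of the file, adding the heading if absent.

    Assumption: '## Session Notes' is the last section, so appending at the end
    of the file keeps the new entries under it.
    """
    text = memory_path.read_text(encoding="utf-8") if memory_path.exists() else ""
    if text and not text.endswith("\n"):
        text += "\n"
    if HEADING not in text:
        text += f"\n{HEADING}\n"
    text += f"\n### {day.isoformat()}\n"
    text += "".join(f"- {d}\n" for d in decisions)
    memory_path.write_text(text, encoding="utf-8")


# Usage against a throwaway file so the example leaves no trace on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "CLAUDE.md"
    path.write_text("# Project: payment-service\n", encoding="utf-8")
    append_session_notes(path, ["Refunds are stored in cents"], date(2025, 1, 7))
    print(path.read_text(encoding="utf-8"))
```

Dated subsections keep the notes scannable, and running the helper twice adds a second dated block without duplicating the heading.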
When Things Go Wrong
Use the Symptom → Evidence → Request pattern: describe what you see, paste the error, then ask for a fix.
You are 90 minutes into a session building a new auth module. The agent starts generating code that forgets the JWT expiry rules you established at the start. What is the most likely cause?
You finish implementing a complex caching layer. The session has been running for two hours. You want to start work on a completely separate rate-limiting module. What is the right sequence?
What kind of content belongs in a project CLAUDE.md?
Key takeaways
- Context rot is real and structural. Quality degrades as the window fills. The fix is engineering, not better prompts.
- Four levers: compaction (distill after committing), just-in-time retrieval (pull files when needed, not all upfront), note-taking (write decisions to memory files before compacting), and subagent isolation (keep exploratory work out of your main window).
- Memory files constrain; prose files evaporate. Every line should pass the “can the agent make a specific decision from this alone?” test.
- AGENTS.md travels. It is honored by Claude Code, Codex CLI, and Antigravity CLI: one file, three tools.
- Session budget: watch /context fill %, compact at natural boundaries after committing, clear when switching tasks.
Next up: Lesson 3: MCP Servers — how the Model Context Protocol extends what the agent can see and do beyond the conversation window.