Context Rot — Why Long Sessions Get Stupid

Long agent sessions degrade in predictable ways. Learn what context rot looks like, why it happens even with million-token windows, and the compaction discipline that keeps quality high.

12 min · Reviewed 2026

Bigger Window, Same Brain

Claude 4.7 has a million-token window. GPT-5 has 400k. You can hand them an entire codebase. They will read it. They will also forget half of it by the time they finish writing the answer. Long context is necessary but not sufficient, and past a certain point a fuller window actively degrades reasoning quality.

The long-context performance curve

Context used	Typical behavior	Quality
0-30k tokens	Sharp, follows instructions, recalls everything	Best work
30k-100k	Slight blurring of details from early turns	Still strong
100k-300k	Misremembers earlier file contents, invents function names	Mixed
300k-1M	Confidently wrong about half of what you fed in	Roll the dice
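
The bands in the table can be turned into a quick lookup, for example to label a session from its token count. This is a minimal Python sketch: the name `quality_band` is hypothetical, and the thresholds are the lesson's rules of thumb, not measured limits of any particular model.

```python
# Rule-of-thumb bands copied from the table above: (upper bound in tokens, label).
# These are heuristics from the lesson, not measured limits of any specific model.
BANDS = [
    (30_000, "Best work"),
    (100_000, "Still strong"),
    (300_000, "Mixed"),
    (1_000_000, "Roll the dice"),
]


def quality_band(tokens_used: int) -> str:
    """Return the table's quality label for a given amount of context used."""
    if tokens_used < 0:
        raise ValueError("tokens_used must be non-negative")
    for upper_bound, label in BANDS:
        if tokens_used <= upper_bound:
            return label
    # The table stops at 1M tokens; anything larger is treated as the worst band.
    return BANDS[-1][1]
```

A label like this is only a prompt to check the symptoms listed next; the agent's actual behavior is the real signal.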

Symptoms of context rot

The agent re-asks for a file you pasted 20 turns ago
It contradicts a constraint you set at the start of the session
It mixes up function names from two different files you discussed
It claims a test is passing when you pasted the failure 5 minutes earlier
It replies more verbosely as the session lengthens (a tell that it is uncertain)

Why it happens

Attention costs scale roughly quadratically with context length. To stay tractable, models use various sparse attention tricks. These work great in benchmarks ("find the needle in the haystack") but degrade in real reasoning where you need to combine ten facts spread across the haystack. The model can still find each one. Combining them is harder.
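
To get a feel for "roughly quadratically", the arithmetic can be worked out directly. The sketch below assumes plain pairwise attention, where cost grows with the square of the context length; real models use the sparse tricks mentioned above, so treat the numbers as an upper-bound illustration. The function name and the 30k baseline are choices made for this example.

```python
def relative_attention_cost(tokens: int, baseline_tokens: int = 30_000) -> float:
    """Cost of full pairwise attention at `tokens`, relative to `baseline_tokens`.

    Assumes cost grows with the square of context length (every token attends
    to every other token). Sparse-attention optimizations are ignored.
    """
    if tokens <= 0 or baseline_tokens <= 0:
        raise ValueError("token counts must be positive")
    return (tokens / baseline_tokens) ** 2


# Compare the upper edge of each band in the performance table.
for n in (30_000, 100_000, 300_000, 1_000_000):
    print(f"{n:>9,} tokens -> {relative_attention_cost(n):8.1f}x the 30k cost")
```

Under that assumption, going from 30k to 300k tokens costs about 100 times as much, and going to 1M costs over 1,100 times as much, which is why shortcuts are needed at all.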

Compaction: deliberately throwing away

# Mid-session compaction prompt for Claude Code, Cursor, Codex
"Pause. Summarize this session into a working brief:
1. What is the goal? (one sentence)
2. What constraints have we agreed on? (bullets)
3. What files have we touched and how? (file -> change)
4. What is the current bug or open question?
5. What should NOT be touched?
Return only the brief. I'll start a fresh session with it."

The model writes its own handoff note. Then you start fresh and paste it in.
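
If you keep handoff notes around, it can help to give the brief a fixed shape. The following is a sketch only: `WorkingBrief` and its field names are hypothetical, mirroring the five questions in the prompt, and the sample values are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingBrief:
    """The five answers requested by the compaction prompt (names are illustrative)."""

    goal: str
    constraints: list[str] = field(default_factory=list)
    files_touched: dict[str, str] = field(default_factory=dict)  # file -> change
    open_question: str = ""
    do_not_touch: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the brief as plain text to paste into a fresh session."""
        lines = [f"Goal: {self.goal}", "Constraints:"]
        lines += [f"- {c}" for c in self.constraints]
        lines.append("Files touched:")
        lines += [f"- {path} -> {change}" for path, change in self.files_touched.items()]
        lines.append(f"Open question: {self.open_question}")
        lines.append("Do not touch:")
        lines += [f"- {item}" for item in self.do_not_touch]
        return "\n".join(lines)


# Hypothetical example values, not taken from any real session.
brief = WorkingBrief(
    goal="Fix flaky login test",
    constraints=["No new dependencies"],
    files_touched={"auth/session.py": "added retry"},
    open_question="Why does the token expire early?",
    do_not_touch=["migrations/"],
)
print(brief.render())
```

A fixed structure makes it easy to spot an empty section, which usually means the model dropped something during summarization.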

Persistent memory beats long context

A 200-line CLAUDE.md, AGENTS.md, or `.cursor/rules` file at the project root is read on every session. It survives compaction. It survives session resets. Anything you find yourself repeating to the agent — conventions, scripts, gotchas — belongs in that file, not in the context window.
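
A small script can confirm which of these files a project has and whether any has outgrown the suggested size. This is a sketch under assumptions: the function names are hypothetical, the file names come from the paragraph above, the 200-line figure is treated as a soft limit, and nothing here checks which tool actually reads which file.

```python
from pathlib import Path

# File names taken from the text; which tool reads which file is not verified here.
MEMORY_FILES = ("CLAUDE.md", "AGENTS.md", ".cursor/rules")
MAX_LINES = 200  # the lesson's suggested size, used here as a soft limit


def audit_memory_files(project_root: str) -> dict[str, int]:
    """Map each project-memory file found at the root to its line count."""
    root = Path(project_root)
    found = {}
    for name in MEMORY_FILES:
        path = root / name
        # Only regular files are counted; a directory with this name is skipped.
        if path.is_file():
            found[name] = len(path.read_text(encoding="utf-8").splitlines())
    return found


def oversized(found: dict[str, int], limit: int = MAX_LINES) -> list[str]:
    """Return the names of memory files longer than `limit` lines."""
    return [name for name, count in found.items() if count > limit]
```

Calling `oversized(audit_memory_files("."))` from the project root lists any memory file that has grown past the limit and may itself need pruning.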

# Session hygiene checklist (run mentally, every 30 min)
1. Has the agent forgotten any earlier constraint? -> compact + re-state it.
2. Are we still on the original goal? -> re-state goal, drop tangents.
3. Has the file state diverged from what the agent thinks it is? -> show `git status`, force re-read.
4. Is the agent repeating itself? -> compact and reset.
5. Have we been at this >1 hour with no commit? -> hard reset.

A 30-second check that prevents most multi-hour disasters.
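
The checklist is meant to be run in your head, but writing it out as code makes the mapping from symptom to action explicit. A minimal sketch, with `SessionState` and `hygiene_actions` as hypothetical names; the answers are supplied by you, not detected automatically.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Your answers to the five checklist questions (field names are illustrative)."""

    forgot_constraint: bool = False
    drifted_from_goal: bool = False
    file_state_diverged: bool = False
    repeating_itself: bool = False
    minutes_since_commit: int = 0


def hygiene_actions(state: SessionState) -> list[str]:
    """Return the checklist's recommended actions, in checklist order."""
    actions = []
    if state.forgot_constraint:
        actions.append("compact and re-state the constraint")
    if state.drifted_from_goal:
        actions.append("re-state the goal and drop tangents")
    if state.file_state_diverged:
        actions.append("show `git status` and force a re-read")
    if state.repeating_itself:
        actions.append("compact and reset")
    if state.minutes_since_commit > 60:  # more than one hour with no commit
        actions.append("hard reset")
    return actions
```

An empty list means the session passes the check and you can keep going.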

When NOT to compact

You're 90% done with a delicate refactor — one more push beats a reset
The session is below 50k tokens and the agent is still sharp
You're debugging a bug that requires the full trace history to understand
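
The three exceptions above amount to a simple rule: skip compaction if any one of them holds. A sketch of that rule, assuming you estimate the inputs yourself; the function and parameter names are hypothetical.

```python
def should_skip_compaction(
    refactor_progress: float,
    tokens_used: int,
    agent_still_sharp: bool,
    needs_full_trace: bool,
) -> bool:
    """Return True if any of the three exceptions from the list applies.

    refactor_progress is your own estimate between 0.0 and 1.0.
    """
    nearly_done = refactor_progress >= 0.9  # "90% done with a delicate refactor"
    small_and_sharp = tokens_used < 50_000 and agent_still_sharp
    return nearly_done or small_and_sharp or needs_full_trace
```

For example, a 200k-token session at 50% progress with no need for the full trace returns `False`, so the compaction advice applies.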

Long context is a runway, not a destination. Land before you crash.
— An agentic systems engineer

The big idea: bigger context windows are a tool, not a free pass. Treat every session as a perishable resource. Compact aggressively, persist conventions in project-memory files, and start fresh when the agent's recall starts to slip. The work you save is your own.

End-of-lesson check

8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-coding-debug-context-rot-creators

What is the main idea of "Context Rot — Why Long Sessions Get Stupid"?
1. Long agent sessions degrade in predictable ways.
2. Use AI as the final authority for the whole decision
3. Avoid checking the answer once it sounds polished
4. Focus only on speed instead of judgment

Which concept is most central to "Context Rot — Why Long Sessions Get Stupid"?
1. compaction
2. context rot
3. context window
4. recency bias

Which use of AI fits this topic best?
1. Let the AI decide what matters without your review
2. Use the answer before checking whether it fits the situation
3. The agent re-asks for a file you pasted 20 turns ago
4. Treat the AI output as automatically correct

What should a careful learner remember about "Recency bias is real"?
1. Use AI to draft or organize ideas about context rot, then verify before acting.
2. Skip the context so the tool can guess faster
3. Treat the output as private even after sharing it online
4. Use the answer without checking the source

You want to use AI after this lesson. What is the safest next step?
1. Act immediately because the AI answer is written clearly
2. Use AI for drafting and comparison, but verify before publishing or relying on it.
3. Hide uncertainty so the final answer looks cleaner
4. Use private or sensitive details before checking permission

How should AI output about context rot be treated?
1. As proof that no other source is needed
2. As a replacement for context, consent, or expert review
3. As a draft or helper output that still needs human judgment and verification
4. As something that becomes correct when it sounds confident

Name one way to verify an AI answer about context rot.

Which action would help you apply "Context Rot — Why Long Sessions Get Stupid" responsibly?
1. Use the tool to avoid thinking through the tradeoff
2. Keep going even if the output conflicts with a trusted source
3. Treat the AI output as automatically correct
4. It contradicts a constraint you set at the start of the session
