Lesson 440 of 1570
Why AI 'Forgets' Halfway Through a Long Chat
AI has a memory limit called the context window. Hitting it explains a LOT of weird behavior.
Lesson map
What this lesson covers
Learning path
The main moves in order
1. Why AI 'Forgets' Halfway Through a Long Chat
2. AI and Why ChatGPT Forgets the Top of the Conversation
3. The big idea
4. AI and context window limits: why long chats forget the start
Concept cluster
Terms to connect while reading
Section 1
Why AI 'Forgets' Halfway Through a Long Chat
What to actually do
- GPT-5-class models hold roughly 200K–1M tokens (several novels' worth of text)
- Older or free models are much smaller (8k–32k)
- When you hit the limit, the start of the chat literally disappears from memory
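That last point can be sketched as a simple sliding window. This is a minimal illustration in Python, not how any real model is implemented: the token limit is tiny on purpose, and the whitespace "tokenizer" is a crude stand-in for the subword tokenizers real models use.

```python
from collections import deque

CONTEXT_LIMIT = 20  # tokens; real models allow 8K-1M+


def count_tokens(message: str) -> int:
    # Crude stand-in: real models split text into subword tokens, not words
    return len(message.split())


def add_message(history: deque, message: str) -> None:
    """Append a message, then drop the oldest ones until the chat fits."""
    history.append(message)
    while sum(count_tokens(m) for m in history) > CONTEXT_LIMIT:
        history.popleft()  # the start of the chat "disappears"


chat = deque()
add_message(chat, "My name is Ada and I am planning a trip to Kyoto")
add_message(chat, "List three temples I should visit there")
add_message(chat, "Now write a packing list for a week in November")

# The first message has been evicted to stay under the limit:
print(chat[0])  # → "List three temples I should visit there"
```

Note what happened: the model was never "told" to forget your name; the first message simply no longer fits in the window, so it is gone.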
Key terms in this lesson
The big idea: AI has a memory limit. Knowing where it is keeps you from being surprised when it forgets.
Section 2
AI and Why ChatGPT Forgets the Top of the Conversation
Section 3
The big idea
AI models can only 'see' a fixed number of tokens at once — that's the context window. When chats get long, the AI literally forgets the top.
Some examples
- GPT-4o handles about 128K tokens — roughly a short novel.
- Claude can handle around 200K tokens.
- When you say 'as I mentioned earlier,' the AI may not remember.
- Long chats run out of memory; start a new one for new tasks.
Try it!
Open your longest ChatGPT thread and ask 'what was the very first thing I asked you in this chat?' — see if it gets it right.
Section 4
AI and context window limits: why long chats forget the start
Section 5
The big idea
AI has a limited context window — usually 100k to 1M tokens. Once you hit it, the model starts forgetting the early part of your chat. Knowing this changes how you structure long projects.
How to use it
- Ask AI for the context window of the model you're using
- Start a new chat for unrelated topics to save context
- Ask AI to summarize your chat so far to refresh memory
- Use Projects or Memory features to persist info across chats
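The "summarize to refresh memory" move above can be sketched as a helper that swaps older messages for one summary line. This is an illustrative sketch: `summarize` is a hypothetical stand-in for an LLM call you would supply, not a real API.

```python
def compress_history(messages: list[str], summarize, keep_last: int = 4) -> list[str]:
    """Replace everything except the most recent messages with one summary.

    `summarize` is a stand-in for an LLM call (hypothetical, supplied by you);
    it takes a long string and returns a shorter one.
    """
    if len(messages) <= keep_last:
        return messages
    old, recent = messages[:-keep_last], messages[-keep_last:]
    summary = summarize("\n".join(old))
    return [f"Summary of earlier chat: {summary}"] + recent


# Toy usage with a fake summarizer that just keeps the first 40 characters:
msgs = [f"message {i}" for i in range(10)]
compact = compress_history(msgs, summarize=lambda text: text[:40] + "...")
print(len(compact))  # → 5: one summary line + the 4 most recent messages
```

The design choice mirrors what you do by hand in a chat: keep the recent turns verbatim (they matter most) and spend only a few tokens on everything older.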
Try it
In a long chat, ask AI to summarize your conversation so far. Then ask 'what was my first question' and check accuracy.
Section 6
Context Windows in 2026: Why Claude 1M Beats GPT 128K for Big Projects
Section 7
The big idea
The context window is how much the AI can 'remember' in one conversation. In 2026 the spread between models is huge, and choosing wrong wastes hours.
Some examples
- 1M tokens ≈ 750,000 words ≈ 7 novels
- Use Claude 1M for: whole codebases, every chapter of one book
- Use GPT-4o 128K for: a long paper, a chat history
- Gemini 2.5 Pro: 2M tokens, even bigger but slower
Try it!
Pick your biggest text source (a long PDF). Try summarizing it in ChatGPT first. If it errors on length, retry in Claude. Note the difference.
Section 8
Why ChatGPT Forgets the Start of Your Conversation (Context Windows Explained)
Section 9
The big idea
Every LLM has a 'context window' — a maximum number of tokens (roughly: word-pieces) it can hold in its short-term memory at once. When the conversation runs longer than that, the oldest messages drop off and the AI 'forgets' them. Modern models range from 8K tokens (very small) to 1M+ tokens (Gemini, Claude Sonnet 4.5). Knowing your model's window prevents the classic bug: 'wait, I told you that an hour ago, why don't you remember?' For long projects, use a model with a big window or save context in a 'system message' or Project.
Some examples
- 1 token ≈ 0.75 words in English — so a 100K token window holds roughly 75,000 words, about a 250-page book.
- Free ChatGPT (GPT-4o-mini) is 128K tokens; Claude Sonnet 4.5 is 1M; Gemini 2.5 Pro is 2M. Long-doc tasks favor Claude or Gemini.
- ChatGPT's 'Memory' feature is separate — it stores facts about you across chats but doesn't expand the context window of any single chat.
- In Claude Projects or ChatGPT Custom GPTs, you can pin instructions and files that persist — a workaround for context-window limits.
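The 1 token ≈ 0.75 words rule of thumb above makes the budget math easy to run yourself. A quick sketch (the 0.75 ratio is a rough heuristic for English; real tokenizers vary with language and content):

```python
def words_that_fit(context_tokens: int) -> int:
    """Roughly how many English words fit in a given context window,
    using the 1 token ~= 0.75 words rule of thumb."""
    return int(context_tokens * 0.75)


for window in (8_000, 128_000, 1_000_000):
    print(f"{window:>9} tokens ≈ {words_that_fit(window):>7} words")
# →     8000 tokens ≈    6000 words
# →   128000 tokens ≈   96000 words
# →  1000000 tokens ≈  750000 words
```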
Try it!
Try this in ChatGPT: paste a long article (5,000+ words), ask 5 questions about it, then ask 'what was the second paragraph about?' If it stumbles, you just hit the context-window edge.
Section 10
Context Windows: Why AI Forgets
Section 11
The big idea
Every AI model has a 'context window' — the amount of text it can hold in attention at once. When you go over, the model literally forgets the start of your conversation or document. Knowing the limit, and how to manage what fills it, separates power users from frustrated ones.
Some examples
- GPT-4 Turbo: 128K tokens. Claude: up to 1M. Gemini: up to 2M. Your prompt counts against this budget.
- Long conversations gradually 'lose' the early instructions — restate them when needed.
- For long documents, summarize sections rather than pasting everything raw.
- Persistent 'memory' features add specific facts back into context every time.
Try it!
Look up the context window of the AI tool you use most. Plan your next big project around it.
Related lessons
Keep going
Builders · 40 min
What a Token Actually Is (And Why It Matters for Your Prompts)
AI doesn't read words — it reads tokens. Knowing the difference makes you a better prompter.
Builders · 40 min
AI and tokens vs words: why your prompt costs what it costs
Learn what a token actually is so you can predict cost and context limits.
Explorers · 40 min
Why AI Forgets the Start of a Long Chat
AI has a memory limit for how much of a chat it can remember at once.
