Gemini 2.5 Pro — how a 1M context actually helps
Everyone brags about million-token windows. Here is what you can actually do with one when you learn how Gemini 2.5 Pro handles long documents.
Lesson map
The main moves in this lesson, in order:
1. A million tokens is a lot of text
2. Real jobs a million tokens enables
3. A real prompt that uses the whole window
Section 1
A million tokens is a lot of text
A million tokens is roughly 750,000 words — the entire Lord of the Rings trilogy plus The Hobbit, with room to spare. Gemini 2.5 Pro holds all of that in context at $1.25 in and $10 out per million tokens (for prompts up to 200K tokens), cheap enough to actually use. The question is: what do you do with it?
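To put the 750,000-word figure in perspective, here is a back-of-the-envelope sketch. The 4-characters-per-token ratio is a common rough heuristic for English prose, not Gemini's actual tokenizer, so treat the numbers as ballpark only:

```python
def rough_token_estimate(text: str) -> int:
    # Heuristic: English prose averages roughly 4 characters per token.
    # An approximation only -- Gemini's real tokenizer will differ.
    return len(text) // 4

# ~750,000 words at ~5 characters each (including the trailing space)
# is ~3.75M characters, landing near the 1M-token mark under this heuristic.
sample = "word " * 750_000             # 750k five-character "words"
print(rough_token_estimate(sample))    # 937500 under the 4-chars/token rule
```

For a real budget check, the API's own token counter is the source of truth; this heuristic is only for quick mental math.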
Section 2
Real jobs a million tokens enables
Compare the options
| Use case | What fits in 1M tokens | Why Gemini 2.5 Pro nails it |
|---|---|---|
| Whole-codebase analysis | ~50,000 lines of code plus tests | Keeps import graph coherent, finds cross-file bugs |
| Hour-long video meeting | Full transcript + slides + chat log | Native multimodal — no separate transcription step |
| Research literature review | 40-60 academic papers side by side | Can cite which paper claimed what |
| Legal discovery | Thousands of emails or a 500-page contract set | Tracks parties, dates, clauses across the corpus |
| Book-length editing pass | Full 80,000-word novel draft | Line edits that stay consistent with chapter 1 while editing chapter 30 |
The trap: you can dump too much
Just because it all fits does not mean you should paste it all. Every token costs money going in and distracts the model on the way out. If the answer is in chapter 3, do not send chapters 1-20.
- Chunk first, summarize, then send just the relevant chunks.
- For true long-context tasks (finding a pattern that spans the whole corpus), send everything, but be specific about what you need.
- Use Gemini's grounding (search) for facts the model could not possibly know.
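The chunk-first advice above can be sketched in a few lines. This is a minimal illustration assuming a crude keyword filter; a real pipeline would split on paragraph or section boundaries and use embeddings or BM25 for relevance, and all names here are illustrative:

```python
def chunk_text(text: str, chunk_chars: int = 8000) -> list[str]:
    # Fixed-size character chunks; real pipelines split on paragraph
    # or section boundaries instead.
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]

def relevant_chunks(chunks: list[str], keywords: list[str]) -> list[str]:
    # Crude relevance filter: keep any chunk mentioning any keyword.
    # Swap in embeddings or BM25 for real retrieval.
    lowered = [k.lower() for k in keywords]
    return [c for c in chunks if any(k in c.lower() for k in lowered)]

# Toy document: a "database" section followed by an unrelated "theming" one.
doc = ("Chapter 3 covers database validation. " * 300
       + "Chapter 7 is about UI theming. " * 300)
selected = relevant_chunks(chunk_text(doc), ["database", "validation"])
# Send only `selected` to the model instead of the full document.
```

If the answer lives in chapter 3, this keeps chapters on UI theming out of the prompt entirely, saving input tokens and reducing distraction.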
Section 3
A real prompt that uses the whole window
One API call, one codebase, one honest audit. That is the 1M-token pitch.
```python
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-pro")

with open("full_codebase_dump.txt", "r") as f:
    codebase = f.read()  # ~400k tokens of Python

resp = model.generate_content(
    [
        codebase,
        "Find every place where user input reaches the database without validation. "
        "Give me file:line and the risk severity.",
    ],
    generation_config={"temperature": 0.1},  # low temperature for a factual audit
)
print(resp.text)
```
Related lessons
- Perplexity Sonar — when search-first beats raw reasoning. Sonar grounds answers in live web results with citations; use it when you need sourced facts instead of raw reasoning.
- Codestral Mamba — state-space architecture. Ditches transformers for a state-space model: linear-time long-context coding at a fraction of the attention cost.
- Kimi K2 — long-context workflow. Moonshot's Kimi K2 specializes in long documents and retrieval-heavy workflows; worth knowing when it beats a generalist.
