Gemini 2.5 Pro — how a 1M context actually helps
Everyone brags about million-token windows. Here is what you can actually do with one when you learn how Gemini 2.5 Pro handles long documents.
A million tokens is a lot of text
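A million tokens is roughly 750,000 words: the entire Lord of the Rings trilogy plus The Hobbit, with room to spare. Gemini 2.5 Pro holds that in working memory at a list price of about $1.25 in and $10 out per million tokens for prompts up to 200,000 tokens, with higher rates (about $2.50 in and $15 out) for longer prompts; check the current price sheet before budgeting. That is cheap enough to actually use. The question is: what do you do with it?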
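To make the arithmetic concrete, here is a minimal sketch of a per-call cost estimate. The function names are made up for illustration, the 0.75 words-per-token ratio is the rule of thumb quoted above, and the rates are illustrative assumptions to be replaced with the current published prices.

```python
def words_to_tokens(words, words_per_token=0.75):
    """Rough conversion using the ~750,000 words per 1M tokens rule of thumb."""
    return round(words / words_per_token)


def estimate_cost_usd(input_tokens, output_tokens, usd_per_m_in, usd_per_m_out):
    """Dollar cost of one call, given per-million-token rates."""
    cost_in = (input_tokens / 1_000_000) * usd_per_m_in
    cost_out = (output_tokens / 1_000_000) * usd_per_m_out
    return cost_in + cost_out


# Illustrative rates only (assumption): substitute the current published prices.
RATE_IN, RATE_OUT = 1.25, 10.0

novel_tokens = words_to_tokens(80_000)  # an 80,000-word draft
cost = estimate_cost_usd(novel_tokens, 2_000, RATE_IN, RATE_OUT)
print(f"{novel_tokens} input tokens, about ${cost:.2f} per call")
```

The input side dominates in long-context work, so the number to watch is how many times you resend the same large prompt, not the length of each answer.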
Real jobs a million tokens enables
Compare the options
| Use case | What fits in 1M tokens | Why Gemini 2.5 Pro nails it |
|---|---|---|
| Whole-codebase analysis | ~50,000 lines of code plus tests | Keeps import graph coherent, finds cross-file bugs |
| Hour-long video meeting | Full recording (or transcript) + slides + chat log | Native multimodal: takes audio and video directly, no separate transcription step |
| Research literature review | 40-60 academic papers side by side | Can cite which paper claimed what |
| Legal discovery | Thousands of emails or a 500-page contract set | Tracks parties, dates, clauses across the corpus |
| Book-length editing pass | Full 80,000-word novel draft | Line edits that stay consistent with chapter 1 while editing chapter 30 |
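Before committing to any of the rows above, it helps to check whether your material plausibly fits. The sketch below is a rough pre-flight check, not an exact count: the 4-characters-per-token ratio and the 50,000-token output reserve are assumptions, and an exact number needs the model's own tokenizer.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate; ~4 characters per token is an assumed heuristic."""
    return int(len(text) / chars_per_token)


def fits_in_window(documents, window_tokens=1_000_000, reserve_for_output=50_000):
    """Return (fits, total_tokens) for a list of document strings."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserve_for_output <= window_tokens, total


# 50 stand-in "papers" of 60,000 characters each (placeholder data).
papers = ["x" * 60_000] * 50
fits, total = fits_in_window(papers)
print(fits, total)
```

Leaving headroom for the prompt and the answer matters: a corpus that estimates at exactly the window size will not fit once the instructions are added.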
The trap: you can dump too much
Just because it all fits does not mean you should paste it all. Every input token costs money, and irrelevant context makes it harder for the model to find the part that matters. If the answer is in chapter 3, do not send chapters 1-20.
- Chunk first, summarize, then send just the relevant chunks.
- For true long-context tasks (finding a pattern across the whole thing), send the whole thing, but be specific about what you need.
- Use Gemini's grounding (search) for facts the model could not possibly know.
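The first bullet above can be sketched as follows. This is a deliberately simple stand-in: it splits on paragraph breaks and ranks chunks by keyword overlap with the question, where a real pipeline might use model-written summaries or embeddings instead. All function names here are hypothetical.

```python
import re


def chunk_text(text, max_chars=2_000):
    """Group paragraphs into chunks of roughly max_chars (a long paragraph stays whole)."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def score(chunk, query):
    """Count the distinct query words that also appear in the chunk."""
    chunk_words = set(re.findall(r"\w+", chunk.lower()))
    query_words = set(re.findall(r"\w+", query.lower()))
    return len(chunk_words & query_words)


def select_relevant(chunks, query, top_k=3):
    """Return up to top_k best-matching chunks, kept in document order."""
    ranked = sorted(range(len(chunks)), key=lambda i: score(chunks[i], query), reverse=True)
    keep = sorted(i for i in ranked[:top_k] if score(chunks[i], query) > 0)
    return [chunks[i] for i in keep]


chapters = [
    "Chapter 1: the hero leaves home.",
    "Chapter 2: a storm sinks the ship.",
    "Chapter 3: the treaty is signed in the harbor town.",
]
relevant = select_relevant(chapters, "Where was the treaty signed?", top_k=1)
print(relevant)
```

Only the selected chunks go into the prompt, which is the point of the bullet: pay for chapter 3, not chapters 1-20.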
A real prompt that uses the whole window
One API call, one codebase, one honest audit. That is the 1M-token pitch.
    import os

    import google.generativeai as genai

    # Read the API key from the environment rather than hard-coding it.
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-2.5-pro")

    # Load the whole codebase as one text file (~400k tokens of Python).
    with open("full_codebase_dump.txt", "r") as f:
        codebase = f.read()

    resp = model.generate_content(
        [
            codebase,
            "Find every place where user input reaches the database without validation. "
            "Give me file:line and the risk severity.",
        ],
        generation_config={"temperature": 0.1},  # low temperature for a consistent audit
    )
    print(resp.text)
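The prompt above asks for file:line references, which only works if the dump carries file names and line numbers. Here is one possible way to build such a dump; the header format and the `build_codebase_dump` helper are assumptions for illustration, not a format the model requires.

```python
import tempfile
from pathlib import Path


def build_codebase_dump(root, suffixes=(".py",)):
    """Concatenate source files under root, prefixing each line with path:lineno."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root).as_posix()
        parts.append(f"=== FILE: {rel} ===")
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            parts.append(f"{rel}:{lineno}: {line}")
    return "\n".join(parts) + "\n"


# Demo on a throwaway directory; point root at a real project in practice.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "app.py").write_text("import db\ndb.run(input())\n", encoding="utf-8")
    dump = build_codebase_dump(tmp)
print(dump)
```

Writing the returned string to `full_codebase_dump.txt` gives the audit prompt something concrete to cite. The line prefixes add tokens, so it is worth checking the dump size before sending.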
