Claude Opus 4.7 — when extended thinking earns its cost
Opus 4.7 shipped in April 2026 with a bigger thinking budget and a 1M-token window at standard prices. Here is what changed, the pricing math, and when the premium is actually worth it.
What this lesson covers
- 1. What Opus 4.7 actually changed
- 2. Extended thinking is the key feature
- 3. A production-grade call
- 4. The cost model you need to internalize
Section 1
What Opus 4.7 actually changed
Anthropic released Claude Opus 4.7 on April 16, 2026. The headline was not a new benchmark. It was that Opus, which had cost $15/$75 per million tokens on earlier generations, now ships at $5 in / $25 out with a 1M-token context window. That is one third of the earlier Opus price on both input and output, which brings a flagship model close to Sonnet-tier pricing and changes which tasks are cost-justified on Opus.
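To make the price change concrete, the sketch below prices one request at both sets of rates quoted above. The workload (200,000 input tokens, 4,000 output tokens) and the helper name `request_cost` are illustrative assumptions, not figures from Anthropic.

```python
# Prices in USD per million tokens, taken from the paragraph above.
OLD_OPUS = {"input": 15.0, "output": 75.0}
OPUS_4_7 = {"input": 5.0, "output": 25.0}


def request_cost(prices, input_tokens, output_tokens):
    """Return the USD cost of one request at the given per-million prices."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


# Hypothetical workload: a long document in, a short report out.
old = request_cost(OLD_OPUS, 200_000, 4_000)
new = request_cost(OPUS_4_7, 200_000, 4_000)
print(f"old: ${old:.2f}  new: ${new:.2f}  ratio: {old / new:.1f}x")
# prints: old: $3.30  new: $1.10  ratio: 3.0x
```

Because both the input and the output price dropped by the same factor, the ratio is 3x for any mix of input and output tokens.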
Section 2
Extended thinking is the key feature
When you enable extended thinking, Claude spends hidden reasoning tokens before emitting the answer. You set a budget; Claude spends up to that budget. The bigger the budget, the deeper the search over approaches. Opus 4.7 handles budgets of tens of thousands of tokens gracefully; Sonnet starts to lose coherence at that scale.
Compare the options
| Dimension | Opus 4.7 without thinking | Opus 4.7 with thinking (8k budget) | Opus 4.7 with thinking (32k budget) |
|---|---|---|---|
| Latency | Similar to Sonnet | +10-20s | +1-3 minutes |
| Input cost impact | Same | Same | Same (input is just the prompt) |
| Output cost impact | Normal | Up to +8k thinking tokens billed | Up to +32k thinking tokens billed |
| Quality lift | Baseline | Meaningful on hard math/code | Decisive on research-grade problems |
| Right for | Everyday work, summarization | Complex debugging, multi-file refactor | Scientific reasoning, long-horizon agents |
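The output-cost row of the table can be turned into dollars with one multiplication. This is a minimal sketch that assumes the $25 per million output price quoted in this lesson and treats "8k" and "32k" as 8,000 and 32,000 tokens; `max_thinking_surcharge` is a hypothetical helper name. The result is a ceiling, since Claude may spend less than the budget.

```python
OUTPUT_PRICE_PER_MTOK = 25.0  # USD, Opus 4.7 output price quoted in this lesson


def max_thinking_surcharge(budget_tokens, price_per_mtok=OUTPUT_PRICE_PER_MTOK):
    """Upper bound on the extra output cost if the whole budget is spent."""
    return budget_tokens * price_per_mtok / 1_000_000


for budget in (0, 8_000, 32_000):
    extra = max_thinking_surcharge(budget)
    print(f"budget {budget:>6}: up to ${extra:.2f} extra per call")
```

Under these assumptions the 8k column adds at most $0.20 per call and the 32k column at most $0.80.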
Where extended thinking is decisive
- Multi-step mathematical proofs where every step has to be verified
- Debugging code that involves invariants across 5+ files, where the thinking phase lets Claude trace the dependency graph
- Legal or policy analysis where multiple clauses interact and the reasoning has to be auditable
- Planning agentic work where the wrong initial plan costs hours of execution
Where it is wasted
- Summarization (the model already knows how)
- Simple Q&A with known facts
- Creative writing — thinking tokens tend to make the final text worse by over-constraining
- Anything where you would accept the first plausible answer
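The two lists above amount to a routing rule: decide per task whether to enable thinking and how much budget to grant. The sketch below shows one way to encode that. The task labels, the budget assigned to each, and the function name `thinking_config` are all hypothetical choices that loosely follow the comparison table, not values recommended by Anthropic.

```python
# Hypothetical task labels and budgets; tune both for your own workload.
THINKING_BUDGETS = {
    "summarization": 0,
    "simple_qa": 0,
    "creative_writing": 0,
    "complex_debugging": 8_000,
    "multi_file_refactor": 8_000,
    "scientific_reasoning": 32_000,
    "agent_planning": 32_000,
}


def thinking_config(task_type):
    """Return a `thinking` argument for the request, or None to leave it off."""
    # Unknown task types default to no thinking, the cheapest option.
    budget = THINKING_BUDGETS.get(task_type, 0)
    if budget == 0:
        return None
    return {"type": "enabled", "budget_tokens": budget}


print(thinking_config("summarization"))   # None
print(thinking_config("agent_planning"))  # {'type': 'enabled', 'budget_tokens': 32000}
```

In a caller, include the `thinking` argument in the request only when this function returns a dictionary. Defaulting unknown tasks to no thinking keeps the expensive path opt-in.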
Section 3
A production-grade call
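The thinking block is part of the response but is typically not shown to the end user. You can log it for audit, but do not paste it into later prompts as ordinary context: it is one-shot reasoning for that turn. The exception is a tool-use loop, where the thinking blocks are returned to the API unmodified alongside the tool results.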
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=20000,  # must exceed budget_tokens; caps thinking plus the visible answer
    thinking={
        "type": "enabled",
        "budget_tokens": 16000,
    },
    messages=[
        {"role": "user", "content": "Given the attached 400-page regulatory filing, identify every section that conflicts with the EU AI Act provisions on high-risk systems."}
    ],
)

# Inspect what Claude was thinking (optional, advanced)
for block in resp.content:
    if block.type == "thinking":
        pass  # hidden reasoning, not shown to end user
    elif block.type == "text":
        print(block.text)

print(resp.usage)  # input_tokens and output_tokens; thinking is billed inside output_tokens

Section 4
The cost model you need to internalize
- Input: $5 per million tokens (cached: ~$0.50)
- Output: $25 per million tokens (includes thinking)
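- Extended thinking with a 16k budget adds up to about $0.40 of output cost per call (16,000 × $25 / 1,000,000); that is a ceiling, because Claude can stop before the budget is spent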
- Prompt caching makes repeat calls over long documents economically viable
- At small call volume, Opus with thinking is usually cheaper than failing and retrying on a smaller model
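The effect of prompt caching on repeat calls can be checked with the prices in the list above. This is a rough sketch: the 300k-token document, the 2k-token answer, and the helper name `call_cost` are illustrative assumptions, the full 16k thinking budget is assumed to be spent, and any surcharge for writing the cache on the first call is ignored.

```python
# USD per million tokens, from the list above.
INPUT, CACHED_INPUT, OUTPUT = 5.00, 0.50, 25.00


def call_cost(fresh_in, cached_in, answer_out, thinking_out):
    """USD cost of one call; thinking tokens are billed at the output rate."""
    total = (
        fresh_in * INPUT
        + cached_in * CACHED_INPUT
        + (answer_out + thinking_out) * OUTPUT
    )
    return total / 1_000_000


# Hypothetical workload: a 300k-token document queried more than once.
first = call_cost(fresh_in=300_000, cached_in=0, answer_out=2_000, thinking_out=16_000)
repeat = call_cost(fresh_in=2_000, cached_in=300_000, answer_out=2_000, thinking_out=16_000)
print(f"first call:  ${first:.2f}")   # first call:  $1.95
print(f"cached call: ${repeat:.2f}")  # cached call: $0.61
```

In this example the input side falls from $1.50 to $0.16 once the document is cached, so the thinking tokens become the dominant cost of each repeat call.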
“The right question is not 'can I afford Opus with extended thinking?' — it is 'can I afford to be wrong on this task?'”