Lesson 88 of 1570
Llama 4 Scout vs. Maverick
Meta's Llama 4 family splits into Scout (lean) and Maverick (flagship). Here is how to choose between them for self-hosted work.
Lesson map
The main moves in order
1. Two open-weight siblings
2. Llama 4 Scout
3. Llama 4 Maverick
4. Open weights
Section 1
Two open-weight siblings
Llama 4 Scout is the compact sibling: cheap to host, fast per token, and strong on mainstream tasks. Maverick is the flagship: a wider mixture-of-experts, stronger reasoning, and a bigger GPU bill. Both ship as open weights under Meta's community license.
Compare the options
| Aspect | Llama 4 Scout | Llama 4 Maverick |
|---|---|---|
| Active params | Smaller MoE | Larger MoE |
| GPU footprint | Single H100 (quantized) | Multi-GPU |
| Quality tier | Sonnet-class | Near-frontier |
| Cost per M tokens (hosted) | $ | $$ |
| Best for | RAG, chat, agents at scale | Complex reasoning, code |
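The GPU-footprint row falls out of simple arithmetic: an MoE checkpoint must hold *all* experts in VRAM, not just the active ones, so total parameter count times bytes per weight is what matters. A rough sizing sketch, using Meta's published total parameter counts (Scout roughly 109B, Maverick roughly 400B; treat both as approximate) and ignoring KV-cache and activation overhead:

```python
# Back-of-envelope VRAM sizing for open-weight MoE checkpoints.
# Parameter counts are approximations of Meta's published totals.

def weight_gb(total_params: float, bits: int) -> float:
    """VRAM (GB) needed just to hold the weights at a given precision."""
    return total_params * bits / 8 / 1e9

H100_GB = 80  # VRAM on a single H100

for name, params in [("Scout", 109e9), ("Maverick", 400e9)]:
    for bits in (16, 8, 4):
        gb = weight_gb(params, bits)
        verdict = "fits one H100" if gb <= H100_GB else "needs multi-GPU"
        print(f"{name} @ {bits}-bit: ~{gb:.0f} GB -> {verdict}")
```

At 4-bit, Scout's weights land around 55 GB (one H100 with room for KV cache); Maverick's land around 200 GB even before overhead, which is why the table says multi-GPU.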
Pick Scout when
- You serve high QPS on a single-box budget
- Latency matters more than peak quality
- You are fine-tuning for a narrow domain
Pick Maverick when
- Quality matches or beats frontier APIs for your eval
- You have multi-GPU capacity or use Together/Fireworks/Bedrock
- Data residency rules forbid public APIs
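If you go the hosted route for Maverick, the major providers expose it behind an OpenAI-compatible endpoint. A minimal sketch, assuming Together's base URL and model id (both provider-specific; `TOGETHER_API_KEY` and the exact model string are assumptions you should check against your provider's docs):

```python
# Hedged sketch: building an OpenAI-style chat request for Maverick on a
# hosted provider. Base URL and model id follow Together's conventions
# but are assumptions -- verify against your provider's documentation.
import json
import os
import urllib.request

BASE_URL = "https://api.together.xyz/v1"  # provider-specific assumption
MODEL_ID = "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8"  # assumption

def chat_request(prompt: str) -> urllib.request.Request:
    """Build (but don't send) an OpenAI-compatible chat completion request."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY', '')}",
        },
    )
```

Send it with `urllib.request.urlopen(chat_request("..."))` or swap in the `openai` client with `base_url=BASE_URL`; because the wire format is OpenAI-compatible, moving between Together, Fireworks, or a self-hosted vLLM server is mostly a base-URL change.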
Scout runs locally on a decent workstation GPU. Maverick usually does not.
```shell
ollama pull llama4:scout
ollama run llama4:scout "Summarize this support ticket"
```
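Once pulled, Scout is also reachable programmatically: Ollama serves a local HTTP API on port 11434. A minimal sketch, assuming the Ollama daemon is running with `llama4:scout` pulled (the model tag and default port come from the commands above):

```python
# Minimal sketch of calling a local Ollama server, assuming the daemon
# is running on its default port with llama4:scout already pulled.
import json
import urllib.request

def build_payload(prompt: str, model: str = "llama4:scout") -> dict:
    """Request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_scout(prompt: str, host: str = "http://localhost:11434") -> str:
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `stream=False` the server returns one JSON object whose `response` field holds the full completion; drop that flag to stream newline-delimited chunks instead.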
Related lessons
Keep going
Builders · 26 min
DeepSeek V3.5 coding
DeepSeek V3.5 is the open-weights model that keeps punching above its weight class on coding benchmarks at a fraction of the cost.
Builders · 28 min
DeepSeek R1 reasoning open-weights
R1 was the open-weights reasoning shock of early 2025. A year later it is still the default for anyone who needs o-series reasoning without paying o-series prices.
Builders · 26 min
Qwen 3 Max — Chinese-English multilingual
Alibaba's Qwen 3 Max is the leading open-weights model for high-quality Chinese work and does English surprisingly well.
