Lesson 1316 of 1570
How an AI Model Actually Gets 'Trained' (No Math)
'Training data,' 'fine-tuning,' 'RLHF' — the words sound mysterious. The actual process is three clear stages.
Lesson map
What this lesson covers
Learning path
The main moves in order
- 1. The big idea
- 2. Pretraining vs Fine-tuning — Why It Matters
Concept cluster
Terms to connect while reading
Section 1
The big idea
Modern AI models go through three stages: (1) Pretraining — reading trillions of words of internet text; (2) Fine-tuning — studying carefully curated examples of 'good' responses; (3) RLHF (Reinforcement Learning from Human Feedback) — humans rate pairs of responses and the model learns which kind people prefer. Each stage costs more per data point than the last, but uses far less data.
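The three stages above can be sketched in code. This is a deliberately toy illustration — real training adjusts billions of neural-network weights, not word counters, and all the function names and numbers here are made up — but it shows the shape of the pipeline: huge raw data first, then a small curated set weighted heavily, then preference pairs nudging behavior.

```python
# Toy sketch of the three training stages. Illustration only: real
# systems update neural-network weights, not word counts, and the
# weighting numbers (10, 5) are arbitrary choices for this demo.
from collections import Counter

def pretrain(corpus):
    """Stage 1: absorb raw text. Counting word frequencies stands in
    for 'learning the statistical patterns of language'."""
    model = Counter()
    for document in corpus:
        model.update(document.lower().split())
    return model

def fine_tune(model, curated_examples):
    """Stage 2: a much smaller set of hand-written good responses,
    each counted more heavily than raw web text."""
    for example in curated_examples:
        for word in example.lower().split():
            model[word] += 10  # curated data counts extra (toy choice)
    return model

def rlhf(model, comparisons):
    """Stage 3: human preference pairs (preferred, rejected).
    Nudge toward preferred phrasing, away from rejected."""
    for preferred, rejected in comparisons:
        for word in preferred.lower().split():
            model[word] += 5
        for word in rejected.lower().split():
            model[word] -= 5
    return model

corpus = ["the cat sat on the mat", "the web has lots of text"]
model = pretrain(corpus)
model = fine_tune(model, ["please answer politely"])
model = rlhf(model, [("sure, happy to help", "no, go away")])
print(model["the"])  # prints 3: the pretraining counts alone
```

Notice the proportions: the raw corpus is by far the biggest input, while fine-tuning and RLHF touch only a handful of words but move them much further per example — the same cost-per-data-point pattern described above.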
Some examples
- GPT-4's pretraining reportedly used roughly 13 trillion tokens — more than every book ever published, plus much of the public web.
- Fine-tuning uses maybe 100,000 hand-written 'this is how to respond well' examples.
- RLHF uses ~1 million human comparisons of 'response A is better than response B.'
- Constitutional AI (Anthropic's approach) replaces some human ratings with the model rating itself against a written 'constitution' of values.
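The Constitutional AI bullet above can also be sketched. In this toy version (every name and rule is invented for illustration; the real system asks the model itself to judge its drafts in plain language), a response is checked against written principles, and the check replaces a human comparison:

```python
# Toy sketch of the Constitutional AI idea: instead of a human rating
# every pair, the model checks its own drafts against written
# principles. Keyword rules stand in for the model's self-judgment.
CONSTITUTION = [
    "avoid insults",
    "be helpful",
]

def violates(response, principle):
    """Hypothetical checker: trivially simple stand-in rules."""
    if principle == "avoid insults":
        return "idiot" in response.lower()
    if principle == "be helpful":
        return response.strip() == ""
    return False

def self_critique(response):
    """Return the list of principles a draft breaks."""
    return [p for p in CONSTITUTION if violates(response, p)]

def choose(draft_a, draft_b):
    """Prefer the draft that breaks fewer principles. This automated
    preference signal replaces some human comparisons in RLHF."""
    if len(self_critique(draft_a)) <= len(self_critique(draft_b)):
        return draft_a
    return draft_b

print(choose("Here's how to fix it.", "You idiot, read the manual."))
```

The key design point: once the preferences come from a written document instead of human raters, you can generate far more of them, and you can read exactly what values the model was trained toward.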
Try it!
Read Anthropic's Constitutional AI paper summary on their website (no math, just plain English about how they trained Claude). It takes 15 minutes and you'll understand more about modern AI than 95% of people.
Section 2
Pretraining vs Fine-tuning — Why It Matters
These two stages built every modern AI. Knowing the difference helps you understand why models behave so differently from one another.
What to actually do
- Pretraining is months of compute, billions of dollars, and most of the text on the web
- Fine-tuning is the part that makes the model polite, helpful, and willing to refuse certain requests
- RLHF is fine-tuning where humans rank responses to teach preferences
The big idea: two stages make an AI. First it reads everything, then it learns how to behave. Both stages matter.
Key terms in this lesson
End-of-lesson quiz
Check what stuck
15 questions · Score saves to your progress.
Tutor
Curious about “How an AI Model Actually Gets 'Trained' (No Math)”?
Ask anything about this lesson. I’ll answer using just what you’re reading — short, friendly, grounded.
Related lessons
Keep going
Builders · 40 min
AI and Why Companies 'Fine-Tune' Their Own AI
Companies retrain AI on their own data — that's fine-tuning, and it's different from prompting.
Builders · 7 min
AI and the training data question: where did all this knowledge come from?
Understand what AI was trained on and why that shapes everything it says.
Creators · 35 min
Transfer Learning
Models trained on one task can often do many others. Understanding why is one of the deepest lessons in modern ML.
