Labeling at Scale: The Hidden Human Layer
Behind every supervised model is an army of human labelers. Understanding how labeling works is understanding who really builds AI.
Lesson map
What this lesson covers
Learning path
The main moves in order
- 1. The Invisible Workforce
- 2. Labeling
- 3. Annotation
- 4. RLHF
Section 1
The Invisible Workforce
When you interact with a polite, helpful model like Claude, you are interacting with the labor of tens of thousands of human labelers. They wrote example responses, ranked model outputs, flagged harmful content, and drew bounding boxes around objects in millions of images.
What labeling looks like
- Image: draw a box around every car in this photo
- Text: does this comment violate community standards?
- Speech: transcribe this 30-second clip
- Ranking: which of these two AI answers is better?
- Red-teaming: try to get the model to produce harmful content
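Different as they look, all of these task types share the same underlying shape: an input, an instruction, and a label produced by a human. A minimal sketch of that data shape (field names are illustrative, not any real labeling platform's schema):

```python
from dataclasses import dataclass

@dataclass
class LabelingTask:
    """One unit of work sent to a human labeler.
    Field names are illustrative, not any real platform's API."""
    task_type: str    # e.g. "image_bbox", "moderation", "transcription", "ranking"
    payload: str      # the image URL, comment text, or audio clip reference
    instructions: str # what the labeler is asked to do

@dataclass
class Label:
    """What the labeler sends back."""
    annotator_id: int
    value: str        # e.g. "violates", a transcript, or "B"

task = LabelingTask(
    task_type="ranking",
    payload="Prompt plus two model responses",
    instructions="Pick the more helpful response.",
)
label = Label(annotator_id=17, value="B")
```

The point is that at scale, labeling is a pipeline: tasks go out, labels come back, and everything downstream depends on those labels being consistent.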
RLHF changed everything
Reinforcement Learning from Human Feedback (RLHF) is the technique that turned raw language models like GPT-3 into helpful assistants like ChatGPT. Humans compare pairs of model responses and pick the better one; a reward model learns to mimic those preferences, and the language model is then fine-tuned with reinforcement learning to maximize the learned reward. OpenAI described this pipeline in the InstructGPT paper.
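At the heart of reward-model training is a simple idea: penalize the model whenever it scores the rejected response higher than the chosen one. A minimal sketch of that Bradley-Terry-style objective, using plain floats instead of a real neural reward model:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry-style loss for one human preference pair.
    The loss shrinks as the reward model scores the
    human-preferred response higher than the rejected one."""
    return -math.log(sigmoid(reward_chosen - reward_rejected))
```

When the chosen response already scores higher, the loss is small; when the reward model disagrees with the human, the loss is large, pushing its scores toward the human ranking.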
A single RLHF preference comparison
Prompt: Explain why the sky is blue.
Response A: Because blue. Moving on.
Response B: Sunlight scatters as it passes through
the atmosphere. Shorter blue wavelengths scatter
more, so the sky appears blue to our eyes.
Labeler picks: B is better.
(This preference trains the reward model.)

Who does the labeling?
Quality control in labeling
- Inter-annotator agreement: have multiple labelers label the same item
- Gold questions: secretly insert questions with known answers
- Majority voting: take the most common answer across N labelers
- Expert review: escalate edge cases to trained reviewers
- Calibration: train labelers on examples until agreement rises
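Two of these checks, majority voting and inter-annotator agreement, are easy to make concrete. A minimal sketch (the labels and annotators here are hypothetical; real pipelines often use chance-corrected metrics like Cohen's kappa rather than raw agreement):

```python
from collections import Counter

def majority_vote(labels: list[str]) -> str:
    """Take the most common label across N annotators for one item."""
    return Counter(labels).most_common(1)[0][0]

def percent_agreement(annotator_a: list[str], annotator_b: list[str]) -> float:
    """Fraction of items where two annotators chose the same label.
    A crude agreement measure; Cohen's kappa also corrects for chance."""
    matches = sum(a == b for a, b in zip(annotator_a, annotator_b))
    return matches / len(annotator_a)

# Hypothetical moderation labels from three annotators on four comments
votes = [
    ["toxic", "toxic", "ok"],   # item 1
    ["ok", "ok", "ok"],         # item 2
    ["toxic", "ok", "toxic"],   # item 3
    ["ok", "ok", "toxic"],      # item 4
]
consensus = [majority_vote(item) for item in votes]
```

Low agreement is a signal about the task, not just the labelers: if trained humans cannot agree on what counts as "toxic," the label definition needs sharpening before any model is trained on it.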
The big idea: AI is not magic; it is a lot of people quietly doing repetitive, sometimes traumatic work to make machines seem smart. Responsible AI includes responsible labor practices.