Lesson 293 of 2116
Variance and Standard Deviation: How Spread Out?
Mean tells you the center. Variance and standard deviation tell you the spread. Without both, you are missing half the story.
Lesson map
What this lesson covers
Learning path
The main moves in order
- 1Two Classes With the Same Average
- 2variance
- 3standard deviation
- 4dispersion
Concept cluster
Terms to connect while reading
Section 1
Two Classes With the Same Average
Both classes have an average test score of 80. Class A: everyone got between 78 and 82. Class B: half got 60 and half got 100. Same mean, very different stories. Variance and standard deviation capture this.
The formula
Two classes, same mean, different spreads
import numpy as np
class_a = [78, 79, 80, 80, 81, 82]
class_b = [60, 60, 60, 100, 100, 100]
print('Class A mean:', np.mean(class_a)) # 80
print('Class A std:', np.std(class_a)) # ~1.3
print('Class B mean:', np.mean(class_b)) # 80
print('Class B std:', np.std(class_b)) # ~20
# Same mean, very different spreads
# Class B's standard deviation is ~15x Class A'sThe 68-95-99.7 rule
For a normal (bell-shaped) distribution, about 68 percent of values fall within one standard deviation of the mean, 95 percent within two, and 99.7 percent within three. This is the empirical rule, and it lets you eyeball whether a single value is normal or unusual.
Why ML cares about variance
- Feature scaling: algorithms like SVMs and neural nets expect features with similar variance
- Model evaluation: reporting mean accuracy across runs without standard deviation is misleading
- Z-scores: (x - mean) / std is a common normalization step
- Detecting drift: if a feature's variance shifts over time, data is changing
Standard deviation in the wild
- IQ scores: mean 100, std 15. A score of 130 is two standard deviations above average.
- US adult male height: mean 175 cm, std 7 cm. A 190 cm man is about two std above average.
- Daily stock returns: often a std of 1 to 2 percent for broad indices
Key terms in this lesson
The big idea: a dataset is a cloud, not a point. Mean locates the center; standard deviation describes the cloud's fuzziness. Always report both.
End-of-lesson quiz
Check what stuck
15 questions · Score saves to your progress.
Tutor
Curious about “Variance and Standard Deviation: How Spread Out?”?
Ask anything about this lesson. I’ll answer using just what you’re reading — short, friendly, grounded.
Progress saved locally in this browser. Sign in to sync across devices.
Related lessons
Keep going
Creators · 45 min
Open vs. Closed Models: Philosophy and Strategy
Open-source AI is both a technical movement and a political one. Understand the arguments so you can pick a stack and defend it.
Creators · 32 min
AP Biology: Using AI to Survive the Vocab Tsunami
AP Bio has roughly a thousand terms and four big concepts. NotebookLM and Claude Projects can turn your textbook into a custom tutor that actually knows what you are studying.
Creators · 40 min
Golden-Dataset Curation
A golden dataset is a curated set of hard, representative examples you trust completely. It is the backbone of every serious eval.
