Geographic Bias: The West Dominates
AI has a geography problem. Training data over-represents North America and Europe, and it shows in subtle and not-so-subtle ways.
AI Thinks Everyone Lives in California
Ask an image model for a photo of a wedding and you will most likely get a white couple in Western dress at a Christian-style ceremony. Ask for a photo of a house and you will often get a suburban American home. This is not because the models hate other places. It is because the training data does not show them enough of the world.
The data skew
| Region | Share of ImageNet images | Share of world population |
|---|---|---|
| United States | 45% | 4% |
| Great Britain | 8% | 0.9% |
| Rest of Europe | 15% | 9% |
| China | 3% | 18% |
| India | 2% | 18% |
| Africa (entire continent) | 1.5% | 17% |
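One way to read the table is as a representation ratio: a region's share of the dataset divided by its share of world population, where 1.0 would mean images in proportion to people. The sketch below takes the table's figures as given; the names `SHARES`, `representation_ratio`, and `ranked_ratios` are illustrative choices, not part of any library.

```python
# Shares copied from the table above, in percent:
# region -> (share of dataset, share of world population)
SHARES = {
    "United States": (45.0, 4.0),
    "Great Britain": (8.0, 0.9),
    "Rest of Europe": (15.0, 9.0),
    "China": (3.0, 18.0),
    "India": (2.0, 18.0),
    "Africa (entire continent)": (1.5, 17.0),
}


def representation_ratio(dataset_share: float, population_share: float) -> float:
    """Return dataset share divided by population share (1.0 = proportional)."""
    if population_share <= 0:
        raise ValueError("population_share must be positive")
    return dataset_share / population_share


def ranked_ratios(shares: dict[str, tuple[float, float]]) -> list[tuple[str, float]]:
    """Return (region, ratio) pairs sorted from most to least over-represented."""
    ratios = [
        (region, representation_ratio(dataset, population))
        for region, (dataset, population) in shares.items()
    ]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for region, ratio in ranked_ratios(SHARES):
        print(f"{region:<28} {ratio:6.2f}x")
```

With these figures the United States lands at roughly 11 times its population share, while Africa lands at under a tenth of its share, a gap of more than 100x between the two.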
Real consequences
- Object detection misidentifies non-Western household items and clothing (a sari labeled as some other garment)
- Self-driving systems trained on US roads can fail in Delhi traffic or on unmarked roads
- Medical imaging models fail on tropical diseases that are underrepresented in Western training sets
- Content moderation misreads cultural context (missing an insult it has never seen)
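Failures like these tend to hide inside a single overall accuracy number, so a common first check is to break evaluation results out by region. The following is a minimal sketch, assuming each evaluation record already carries a region label; the record format, the function name `accuracy_by_region`, and the sample data are hypothetical and invented purely for illustration.

```python
from collections import defaultdict


def accuracy_by_region(records):
    """Compute accuracy per region from (region, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for region, true_label, predicted_label in records:
        total[region] += 1
        if true_label == predicted_label:
            correct[region] += 1
    return {region: correct[region] / total[region] for region in total}


# Hypothetical toy records, not real measurements.
records = [
    ("North America", "wedding dress", "wedding dress"),
    ("North America", "stove", "stove"),
    ("North America", "house", "house"),
    ("North America", "toothbrush", "toothbrush"),
    ("South Asia", "sari", "dress"),
    ("South Asia", "stove", "stove"),
    ("South Asia", "house", "house"),
    ("South Asia", "toothbrush", "stick"),
]

per_region = accuracy_by_region(records)
overall = sum(true == pred for _, true, pred in records) / len(records)

print(f"overall: {overall:.2f}")
for region, accuracy in sorted(per_region.items()):
    print(f"{region}: {accuracy:.2f}")
```

In this toy data the overall accuracy of 0.75 masks a split of 1.00 versus 0.50 between the two regions, which is the pattern a per-region breakdown is meant to expose.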
The Dollar Street project
Gapminder's Dollar Street collected tens of thousands of photos of homes and everyday household objects across 50+ countries and organized them by household income. Models trained or fine-tuned on Dollar Street images have been reported to perform substantially better on photos from low-income households around the world. It is a clear example of what intentional geographic collection can produce.
Efforts to fix the gap
- Masakhane: community-driven NLP datasets for African languages
- AI4Bharat: tools and data for 22 Indian languages
- Common Voice: Mozilla's crowdsourced speech in 100+ languages
- WildReceipt: receipts from low-resource languages, not just English
- Global-MMLU: a multilingual benchmark extending MMLU across many languages and cultural contexts
The big idea: AI trained on Western data inherits Western assumptions about what normal looks like. Global AI requires global data, and global data requires global investment.
