Geographic Bias: The West Dominates

AI has a geography problem. Training data over-represents North America and Europe, and it shows in subtle and not-so-subtle ways.

28 min · Reviewed 2026

AI Thinks Everyone Lives in California

Ask an image model for a photo of a wedding and you will most likely get a white couple, Western dress, a Christian-style ceremony. Ask for a photo of a house and you will get a suburban American home. This is not because the models hate other places. It is because the data does not show them enough of the world.

The data skew

Region	Share of ImageNet images	Share of world population
United States	45%	4%
Great Britain	8%	0.9%
Rest of Europe	15%	9%
China	1%	18%
India	2%	18%
Africa (entire continent)	1.5%	17%

Source: Shankar et al., 2017
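One way to read the table is as a representation ratio: a region's share of the dataset divided by its share of world population, where 1.0 would mean proportional representation. The sketch below is a minimal illustration using three rows copied from the table. The function name `representation_ratio` and the dictionary names are made up for this example and do not come from any library.

```python
# Shares (in percent) copied from three rows of the table above.
DATASET_SHARE = {"United States": 45.0, "India": 2.0, "Africa": 1.5}
POPULATION_SHARE = {"United States": 4.0, "India": 18.0, "Africa": 17.0}


def representation_ratio(dataset_share: float, population_share: float) -> float:
    """Dataset share divided by population share; 1.0 means proportional."""
    return dataset_share / population_share


# Hypothetical helper structure: one ratio per region.
ratios = {
    region: representation_ratio(DATASET_SHARE[region], POPULATION_SHARE[region])
    for region in DATASET_SHARE
}

# Print from most over-represented to most under-represented.
for region, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{region}: {ratio:.2f}x")
```

On these numbers the United States appears at roughly eleven times its population share, while India and Africa each appear at about a tenth of theirs.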

Real consequences

Object detection misidentifies non-Western household items (sari confused for other garments)
Self-driving cars trained on US roads fail on Delhi traffic or unmarked roads
Medical imaging models fail on tropical diseases underrepresented in Western training sets
Content moderation misreads cultural context (dismissing an insult it has never seen)
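Failures like these stay hidden when a model is scored with a single overall accuracy number. A common way to surface them is to break the score down by region. The sketch below assumes each evaluation record carries a region tag; the function `accuracy_by_region`, the region names, and the toy records are all invented for illustration and are not real results.

```python
from collections import defaultdict


def accuracy_by_region(records):
    """records: iterable of (region, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for region, truth, predicted in records:
        total[region] += 1
        correct[region] += int(truth == predicted)
    return {region: correct[region] / total[region] for region in total}


# Made-up toy records, only to exercise the function. Not real measurements.
toy_records = [
    ("region_a", "dress", "dress"),
    ("region_a", "stove", "stove"),
    ("region_a", "house", "house"),
    ("region_a", "soap", "lotion"),
    ("region_b", "sari", "dress"),
    ("region_b", "stove", "fireplace"),
    ("region_b", "house", "house"),
    ("region_b", "soap", "soap"),
]

per_region = accuracy_by_region(toy_records)
# Gap between the best-served and worst-served region.
gap = max(per_region.values()) - min(per_region.values())
print(per_region, gap)
```

The overall accuracy on the toy records is 62.5%, which hides the fact that one region sits at 75% and the other at 50%. Reporting the gap alongside the average makes that difference visible.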

The Dollar Street project

Gapminder's Dollar Street photographed hundreds of homes across 50+ countries, producing tens of thousands of images, and organized them by household income. Object recognition systems evaluated on these photos perform noticeably worse on lower-income households, and fine-tuning on Dollar Street images has been reported to narrow that gap. It is a beautiful example of what intentional geographic collection produces.
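Collecting new data is the real fix. When only a skewed dataset is at hand, one stopgap is to reweight examples so that each region contributes equally during sampling. The sketch below is an assumption-laden illustration, not something Dollar Street itself prescribes: the function `balanced_sample_weights` and the region tags are hypothetical, and reweighting cannot add images a dataset never contained.

```python
import random
from collections import Counter


def balanced_sample_weights(regions):
    """Return one weight per example so every region has equal total weight."""
    counts = Counter(regions)
    n_regions = len(counts)
    return [1.0 / (n_regions * counts[region]) for region in regions]


# Hypothetical region tags for six training examples.
regions = ["us", "us", "us", "us", "india", "nigeria"]
weights = balanced_sample_weights(regions)

# Draw a batch of example indices; each region is now equally likely overall.
rng = random.Random(0)
batch = rng.choices(range(len(regions)), weights=weights, k=4)
print(weights, batch)
```

Each of the four "us" examples gets weight 1/12, while the single "india" and "nigeria" examples get 1/3 each, so every region carries a third of the total. The trade-off is that rare examples are repeated often, which is one reason the lesson points to collection rather than reweighting as the lasting answer.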

Efforts to fix the gap

Masakhane: community-driven NLP datasets for African languages
AI4Bharat: tools and data for 22 Indian languages
Common Voice: Mozilla's crowdsourced speech in 100+ languages
WildReceipt: receipts from low-resource languages, not just English
Global-MMLU: a multilingual benchmark extending MMLU across dozens of languages

The big idea: AI trained on Western data inherits Western assumptions about what normal looks like. Global AI requires global data, and global data requires global investment.

End-of-lesson check

8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-data-geographic-bias

What is the main idea of "Geographic Bias: The West Dominates"?
1. AI has a geography problem. Training data over-represents North America and Europe, and it shows in subtle and not-so-subtle ways.
2. Use AI as the final authority for the whole decision
3. Avoid checking the answer once it sounds polished
4. Focus only on speed instead of judgment

Which concept is most central to "Geographic Bias: The West Dominates"?
1. Western-centric
2. geographic bias
3. global datasets
4. Dollar Street

Which use of AI fits this topic best?
1. Let the AI decide what matters without your review
2. Use the answer before checking whether it fits the situation
3. Object detection misidentifies non-Western household items (sari confused for other garments)
4. Treat the AI output as automatically correct

What should a careful learner remember about "Source: Shankar et al., 2017"?
1. Use "Source: Shankar et al., 2017" as a reminder to verify the AI output before anyone relies on it.
2. Skip the context so the tool can guess faster
3. Treat the output as private even after sharing it online
4. Use the answer without checking the source

You want to use AI after this lesson. What is the safest next step?
1. Act immediately because the AI answer is written clearly
2. Use AI for drafting and comparison, but verify before publishing or relying on it.
3. Hide uncertainty so the final answer looks cleaner
4. Use private or sensitive details before checking permission

How should AI output about geographic bias be treated?
1. As proof that no other source is needed
2. As a replacement for context, consent, or expert review
3. As a draft or helper output that still needs human judgment and verification
4. As something that becomes correct when it sounds confident

Name one way to verify an AI answer about geographic bias.

Which action would help you apply "Geographic Bias: The West Dominates" responsibly?
1. Use the tool to avoid thinking through the tradeoff
2. Keep going even if the output conflicts with a trusted source
3. Treat the AI output as automatically correct
4. Self-driving cars trained on US roads fail on Delhi traffic or unmarked roads
