AI has a geography problem. Training data over-represents North America and Europe, and it shows in subtle and not-so-subtle ways.
Ask an image model for a photo of a wedding and you will most likely get a white couple, Western dress, a Christian-style ceremony. Ask for a photo of a house and you will get a suburban American home. This is not because the models hate other places. It is because the data does not show them enough of the world.
| Region | Share of ImageNet images | Share of world population |
|---|---|---|
| United States | 45% | 4% |
| Great Britain | 8% | 0.9% |
| Rest of Europe | 15% | 9% |
| China | 3% | 18% |
| India | 2% | 18% |
| Africa (entire continent) | 1.5% | 17% |

Source: Shankar et al., 2017
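One way to read the table is to divide each region's share of images by its share of world population. A value of 1 would mean the region appears in proportion to its population. The short Python sketch below does that arithmetic using only the figures from the table. The name `representation_ratio` is our own label for illustration, not a term from the source.

```python
# Figures copied from the table: (share of ImageNet in %, share of world population in %).
shares = {
    "United States": (45.0, 4.0),
    "Great Britain": (8.0, 0.9),
    "Rest of Europe": (15.0, 9.0),
    "China": (3.0, 18.0),
    "India": (2.0, 18.0),
    "Africa (entire continent)": (1.5, 17.0),
}


def representation_ratio(image_share, population_share):
    """Image share divided by population share; 1.0 means proportional."""
    return image_share / population_share


# Hypothetical helper output: one ratio per region in the table.
ratios = {
    region: representation_ratio(image_share, population_share)
    for region, (image_share, population_share) in shares.items()
}

# Print regions from most over-represented to most under-represented.
for region, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    label = "over" if ratio > 1 else "under"
    print(f"{region:28s} {ratio:6.2f}x  ({label}-represented)")
```

By this measure the United States shows up roughly 11 times more often than its population share would suggest, while the entire African continent shows up at less than a tenth of its share.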
Gapminder's Dollar Street photographed homes across 50+ countries and organized them by household income. Models trained on Dollar Street images have been reported to recognize everyday objects from low-income households more accurately. It is a beautiful example of what intentional geographic collection produces.
The big idea: AI trained on Western data inherits Western assumptions about what normal looks like. Global AI requires global data, and global data requires global investment.
8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-data-geographic-bias
What is the main idea of "Geographic Bias: The West Dominates"?
Which concept is most central to "Geographic Bias: The West Dominates"?
Which use of AI fits this topic best?
What should a careful learner remember about "Source: Shankar et al., 2017"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about geographic bias be treated?
Name one way to verify an AI answer about geographic bias.
Which action would help you apply "Geographic Bias: The West Dominates" responsibly?