Some capabilities grow smoothly with scale. Others seem to appear out of nowhere. Telling them apart is a whole research program.
The Big Question
Is AI capability a smooth climb or a staircase? The answer is probably 'both, depending on how you measure.' Understanding the argument is central to forecasting what the next generation of models will and will not do.
Wei et al. (2022) catalogued capabilities that appeared to 'emerge' at particular scales — arithmetic, instruction following, in-context learning. Below a threshold, performance was near random; above it, performance jumped sharply.
Schaeffer, Miranda, and Koyejo (2023) argued that many emergent abilities are a function of the metric, not the model. Switch from strict exact-match to partial-credit scoring, and the cliff becomes a gentle hill. Emergence might be about how we look, not what is there.
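The metric argument can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of the paper's experiments: the `per_token_accuracy` curve, its endpoints, the 10-token answer length, and the independence of token errors are all hypothetical choices made for the example.

```python
import math

def per_token_accuracy(params: float) -> float:
    """Hypothetical smooth curve: accuracy rises linearly in log10(params)."""
    # Assumed for illustration: 0.50 at 1e7 parameters, 0.98 at 1e11.
    frac = (math.log10(params) - 7) / 4
    return 0.50 + 0.48 * min(max(frac, 0.0), 1.0)

def exact_match(p: float, answer_len: int) -> float:
    """Strict score: every token must be right (assumes independent errors)."""
    return p ** answer_len

ANSWER_LEN = 10  # assumed answer length in tokens

# Each row holds (log10 of parameter count, partial-credit score, exact-match score).
rows = []
for exponent in range(7, 12):
    p = per_token_accuracy(10.0 ** exponent)
    rows.append((exponent, p, exact_match(p, ANSWER_LEN)))

for exponent, partial, strict in rows:
    print(f"1e{exponent} params  partial credit={partial:.2f}  exact match={strict:.3f}")
```

Under these assumed numbers, the partial-credit column climbs by the same amount at every step, while the exact-match column stays close to zero for the smaller models and then rises steeply. Both columns are computed from the same underlying model; only the scoring rule differs.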
| View | Claim | Implication |
|---|---|---|
| Strong emergence | Capabilities really do appear at thresholds | Forecasting is hard; surprises are inevitable |
| Mirage view | Smoothness is hidden by harsh metrics | Forecasting is possible with better metrics |
| Middle ground | Some emergence is real, some is measurement | Depends on task — check both framings |
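Checking both framings on a given task can be as simple as scoring the same outputs twice. The sketch below is a minimal example with made-up data: the target string, the three model outputs, and the helper names `exact_match_score` and `token_overlap_score` are hypothetical, and position-wise token overlap is only one of several possible partial-credit metrics.

```python
def exact_match_score(prediction: str, target: str) -> float:
    """Strict framing: full credit only for an identical answer."""
    return 1.0 if prediction == target else 0.0

def token_overlap_score(prediction: str, target: str) -> float:
    """Partial-credit framing: share of positions where the tokens match."""
    pred_tokens = prediction.split()
    target_tokens = target.split()
    if not target_tokens:
        return 1.0 if not pred_tokens else 0.0
    hits = sum(p == t for p, t in zip(pred_tokens, target_tokens))
    # Dividing by the longer length penalises extra or missing tokens.
    return hits / max(len(pred_tokens), len(target_tokens))

# Hypothetical outputs for one digit-by-digit arithmetic answer.
target = "4 9 1 3"
outputs = {"small": "4 0 0 0", "medium": "4 9 1 0", "large": "4 9 1 3"}

scores = {
    name: (exact_match_score(text, target), token_overlap_score(text, target))
    for name, text in outputs.items()
}
for name, (strict, partial) in scores.items():
    print(f"{name:<6}  exact match={strict:.2f}  token overlap={partial:.2f}")
```

If the two columns tell the same story, the jump is less likely to be a metric artifact. If only the strict column jumps, as in this toy data, the mirage explanation deserves a closer look for that task.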
Our findings suggest that existing claims of emergent abilities are creations of the researcher's choice of metrics.
— Schaeffer et al., Are Emergent Abilities of Large Language Models a Mirage? (2023)
The big idea: whether AI capabilities emerge suddenly or grow smoothly depends partly on how you look. Either way, the surprises are real enough to plan for.
8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-creators-emergence-vs-scaling
What is the main idea of "Emergence vs. Scaling"?
Which concept is most central to "Emergence vs. Scaling"?
Which use of AI fits this topic best?
What should a careful learner remember about "Why this is not settled"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about emergence be treated?
Name one way to verify an AI answer about emergence.
Which action would help you apply "Emergence vs. Scaling" responsibly?