Emergent abilities make AI both more exciting and more dangerous. How do labs forecast what the next model will do — and what happens when they are wrong?
Investors, regulators, and safety teams all want to know: what will the next model be able to do? If abilities emerge in jumps, the question is harder than it sounds. You cannot measure a capability that does not exist yet.
| View | Argument |
|---|---|
| Real phenomenon | Abilities appear suddenly at scale thresholds |
| Measurement artifact | Smooth underlying progress, hidden by binary metrics |
| Likely both | Some abilities genuinely snap in, others improve smoothly on a log scale |
Schaeffer et al. (2023) argued that many reported emergent abilities disappear when all-or-nothing metrics such as exact match are replaced with continuous ones. Follow-up work by other labs has nonetheless reported sudden jumps that remain after metric smoothing. The honest answer: it depends on the task.
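To make the measurement-artifact view concrete, here is a minimal toy sketch in Python. It is not taken from any paper or lab. The function names, the linear skill curve, the 20-token answer length, and the assumption that tokens are right or wrong independently are all invented for illustration. It shows how a per-token accuracy that improves smoothly can look like a sudden jump once it is scored with an all-or-nothing metric.

```python
# Illustrative sketch only: every number below is made up for the demonstration.

def per_token_accuracy(scale_step: int, steps: int = 10) -> float:
    """Hypothetical smooth skill curve: rises linearly from 0.50 to 0.95."""
    return 0.50 + 0.45 * scale_step / (steps - 1)


def exact_match_rate(p: float, answer_length: int) -> float:
    """Chance that every token in an answer is correct.

    Assumes tokens are independent, which real models do not guarantee.
    """
    return p ** answer_length


def metric_table(answer_length: int = 20, steps: int = 10) -> list[tuple[int, float, float]]:
    """Return (scale_step, per_token_accuracy, exact_match_rate) rows."""
    rows = []
    for step in range(steps):
        p = per_token_accuracy(step, steps)
        rows.append((step, p, exact_match_rate(p, answer_length)))
    return rows


if __name__ == "__main__":
    for step, p, em in metric_table():
        print(f"scale step {step}: per-token {p:.2f}, exact match {em:.4f}")
```

Under these toy assumptions, per-token accuracy climbs by the same amount at every step, while exact match stays under 1% through the midpoint and only becomes visible in the last few steps. The underlying progress is the same in both columns; the two metrics simply tell different stories about it. This sketch illustrates the artifact argument only and does not rule out genuine jumps on other tasks.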
Frontier labs like Anthropic, OpenAI, and Google DeepMind publish policies committing to specific evaluations at capability thresholds. Anthropic's Responsible Scaling Policy defines AI Safety Levels (ASL) and requires mitigations proportional to the risk tier.
The scariest part of a capability evaluation is the tasks nobody remembered to include.
— A safety researcher
The big idea: emergence makes forecasting genuinely hard. Responsible labs publish policies, run structured evals, and accept that surprise is a load-bearing assumption of the field.
8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-creators-emergence-and-capability-forecasting
What is the main idea of "Emergence, Capability Forecasting, and Safety"?
Which concept is most central to "Emergence, Capability Forecasting, and Safety"?
Which use of AI fits this topic best?
What should a careful learner remember about "ASL in brief"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about emergence be treated?
Name one way to verify an AI answer about emergence.
Which action would help you apply "Emergence, Capability Forecasting, and Safety" responsibly?