Emergent abilities make AI both more exciting and more dangerous. How do labs forecast what the next model will do — and what happens when they are wrong?
Investors, regulators, and safety teams all want to know: what will the next model be able to do? If abilities emerge in jumps, the question is harder than it sounds. You cannot measure a capability that does not exist yet.
| View | Argument |
|---|---|
| Real phenomenon | Abilities appear suddenly at scale thresholds |
| Measurement artifact | Smooth underlying progress, hidden by binary metrics |
| Likely both | Some abilities genuinely snap in, others improve smoothly with the logarithm of scale |
Schaeffer et al. (2023) argued that many reported emergent abilities disappear when discontinuous metrics, such as exact match, are replaced with continuous ones. Follow-up work by other labs still found residual sudden jumps after metric smoothing. The honest answer: it depends on the task.
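To see how a binary metric can hide smooth progress, here is a minimal sketch. The accuracy values and the 10-token answer length are made-up numbers for illustration, and the sketch assumes token errors are independent, which real models do not strictly satisfy. It is not a reproduction of any paper's experiment.

```python
# Hypothetical numbers: per-token accuracy rising smoothly as models get larger.
per_token_accuracy = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]

# Assumed answer length: every one of these tokens must be correct to score a point.
ANSWER_LENGTH = 10


def exact_match_rate(p: float, length: int) -> float:
    """Chance that all `length` tokens are right, assuming independent errors."""
    return p ** length


exact_match = [exact_match_rate(p, ANSWER_LENGTH) for p in per_token_accuracy]

# Print the continuous metric next to the all-or-nothing metric.
for p, em in zip(per_token_accuracy, exact_match):
    print(f"per-token accuracy {p:.2f} -> exact match {em:.3f}")
```

The per-token column climbs steadily, while the exact-match column stays under 3 percent until per-token accuracy reaches 0.70 and then rises to roughly 90 percent. Same underlying progress, very different story depending on which column you plot.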
Frontier labs like Anthropic, OpenAI, and Google DeepMind publish policies committing to specific evaluations at capability thresholds. Anthropic's Responsible Scaling Policy defines AI Safety Levels (ASL) and requires mitigations proportional to the risk tier.
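The general shape of a threshold-based policy can be sketched in a few lines. Everything below is hypothetical: the tier names, score thresholds, and mitigation labels are invented for illustration and are not Anthropic's ASL definitions or any lab's actual policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str                     # hypothetical label, not an official ASL
    min_score: float              # eval score at or above which the tier applies
    mitigations: tuple[str, ...]  # safeguards required at this tier


# Illustrative tiers only, ordered from lowest to highest threshold.
TIERS = (
    Tier("tier-1", 0.0, ("standard release review",)),
    Tier("tier-2", 0.3, ("standard release review", "hardened weight security")),
    Tier("tier-3", 0.6, ("standard release review", "hardened weight security",
                         "deployment restrictions")),
)


def required_tier(eval_score: float) -> Tier:
    """Return the highest tier whose threshold the score meets."""
    matched = TIERS[0]
    for tier in TIERS:
        if eval_score >= tier.min_score:
            matched = tier
    return matched


def missing_mitigations(eval_score: float, in_place: set[str]) -> list[str]:
    """List the required mitigations that are not yet in place."""
    return [m for m in required_tier(eval_score).mitigations if m not in in_place]


# A model scoring 0.7 with only the basic review done still owes two safeguards.
print(missing_mitigations(0.7, {"standard release review"}))
```

The point of the structure is that the evaluation result, not a release schedule, decides which safeguards must exist before the model moves forward. A gate like this only covers the capabilities someone thought to test.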
The scariest part of a capability evaluation is the tasks nobody remembered to include.
— A safety researcher
The big idea: emergence makes forecasting genuinely hard. Responsible labs publish policies, run structured evals, and accept that surprise is a load-bearing assumption of the field.
15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-creators-emergence-and-capability-forecasting
What is the core idea behind "Emergence, Capability Forecasting, and Safety"?
Which term best describes a foundational idea in "Emergence, Capability Forecasting, and Safety"?
A learner studying Emergence, Capability Forecasting, and Safety would need to understand which concept?
Which of these is directly relevant to Emergence, Capability Forecasting, and Safety?
Which of the following is a key point about Emergence, Capability Forecasting, and Safety?
Which of these does NOT belong in a discussion of Emergence, Capability Forecasting, and Safety?
Which statement is accurate regarding Emergence, Capability Forecasting, and Safety?
Which of these does NOT belong in a discussion of Emergence, Capability Forecasting, and Safety?
What is the key insight about "ASL in brief" in the context of Emergence, Capability Forecasting, and Safety?
What is the key insight about "Unknown unknowns remain the hard problem" in the context of Emergence, Capability Forecasting, and Safety?
What is the recommended tip about "Ground your practice in fundamentals" in the context of Emergence, Capability Forecasting, and Safety?
Which statement accurately describes an aspect of Emergence, Capability Forecasting, and Safety?
What does working with Emergence, Capability Forecasting, and Safety typically involve?
Which of the following is true about Emergence, Capability Forecasting, and Safety?
Which best describes the scope of "Emergence, Capability Forecasting, and Safety"?