Break a hard task into smaller subtasks. Solve each with an AI helper. Combine the answers. Repeat. That is iterative amplification (also called iterated amplification), a blueprint for supervising AI on tasks that humans can't check alone.
Paul Christiano's 2018 framework builds on a thought experiment called HCH, short for "Humans Consulting HCH." Imagine you can summon copies of yourself to answer small subquestions, and those copies can summon more copies. In the limit, a carefully managed tree of humans answers questions that none of them could answer alone.
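The recursive tree described above can be sketched as a toy program. This is only an illustration under stated assumptions, not Christiano's formulation: the task (summing a list), the rule that a helper can add at most two numbers, and every function name here (`can_answer_directly`, `decompose`, `combine`, `hch`) are hypothetical stand-ins chosen for the example.

```python
from typing import List


def can_answer_directly(question: List[int]) -> bool:
    # Assumption: a single helper can only add up to two numbers.
    return len(question) <= 2


def answer_directly(question: List[int]) -> int:
    # The small step a single helper is trusted to do.
    return sum(question)


def decompose(question: List[int]) -> List[List[int]]:
    # Split the task into two smaller subquestions.
    mid = len(question) // 2
    return [question[:mid], question[mid:]]


def combine(sub_answers: List[int]) -> int:
    # Merging two sub-answers is itself a two-number addition.
    return sum(sub_answers)


def hch(question: List[int], depth_budget: int) -> int:
    """Answer a question by consulting copies of the same procedure."""
    if can_answer_directly(question):
        return answer_directly(question)
    if depth_budget == 0:
        # The tree is "carefully managed": it may not grow without limit.
        raise RuntimeError("depth budget exhausted before the task was small enough")
    sub_answers = [hch(sub, depth_budget - 1) for sub in decompose(question)]
    return combine(sub_answers)


if __name__ == "__main__":
    print(hch(list(range(1, 9)), depth_budget=3))  # prints 36
```

No single call ever adds more than two numbers, yet the tree as a whole sums eight. The depth budget stands in for the management of the tree: with `depth_budget=1` the same call raises an error instead of answering.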
The core bet is that if I can break a hard task into tasks I can do, and I can align an AI to do each one, I have aligned an AI to do the whole thing.
— Paul Christiano, Alignment Research Center
The big idea: amplification treats alignment as a property that must survive a training loop, not a gate you pass once. That framing has influenced much of modern safety work.
8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-safety2-iterative-amplification-creators
What is the main idea of "Iterative Amplification"?
Which concept is most central to "Iterative Amplification"?
Which use of AI fits this topic best?
What should a careful learner remember about "The safety claim"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about iterated amplification be treated?
Name one way to verify an AI answer about iterated amplification.
Which action would help you apply "Iterative Amplification" responsibly?