Sometimes a network memorizes its training data, then, long after you would have stopped training, suddenly generalizes. That is grokking, a real and strange phenomenon.
In 2022, Power, Burda, Edwards, and colleagues at OpenAI reported something strange: a small transformer trained on modular arithmetic would reach ~100% training accuracy while test accuracy stayed near zero for thousands of epochs. Then, suddenly, test accuracy would snap to ~100%. They called the phenomenon grokking, after the Heinlein word for 'to understand fully.'
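To make the setup concrete, here is a minimal sketch of the kind of dataset those experiments use: all pairs (a, b) with the label (a + b) mod p, split into train and test. The modulus p = 97 and the 50/50 split are assumptions for illustration; the paper sweeps over several operations and split fractions.

```python
import random

def make_modular_addition_data(p=97, train_frac=0.5, seed=0):
    """Enumerate all p*p input pairs, label each with (a + b) % p,
    and randomly split into train and test sets."""
    pairs = [(a, b) for a in range(p) for b in range(p)]
    rng = random.Random(seed)
    rng.shuffle(pairs)
    cut = int(train_frac * len(pairs))
    train = [((a, b), (a + b) % p) for a, b in pairs[:cut]]
    test = [((a, b), (a + b) % p) for a, b in pairs[cut:]]
    return train, test

train, test = make_modular_addition_data()
```

Because the full input space is only p² examples, the network can memorize the training half outright, which is exactly what makes the later jump in test accuracy so striking.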
Accuracy
1.0 |      Train ______________________
    |     /
    |    /                      Test
0.5 |   /                      ________
    |  /                      /
0.0 |_/______________________/________ Time
       memorize              generalize
       (early)               (much later)

Training accuracy saturates early. Test accuracy stays low, then snaps up far later.

Grokking suggests that 'more training' can sometimes qualitatively change a model's behavior: not just improve a score, but switch to a different algorithm internally. That has implications for how we evaluate safety during training.
We show that neural networks can 'grok' algorithmic tasks, generalizing well after overfitting the training set.
— Power et al., "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" (2022)
The big idea: learning is not monotonic. Grokking shows that long after training looks 'done,' the internal algorithm can still be changing.