A 2015 paper from Microsoft Research let neural networks go 152 layers deep by adding a shortcut around groups of layers.
After AlexNet in 2012, everyone wanted deeper networks. VGG went to 16 and 19 layers in 2014. GoogLeNet went to 22. But researchers kept hitting a wall: past a certain depth, adding layers made networks worse, not better, even on training data, which meant the problem was not overfitting.
In December 2015, Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun at Microsoft Research Asia published "Deep Residual Learning for Image Recognition". They trained networks 152 layers deep and won the ILSVRC 2015 ImageNet classification challenge with a 3.57% top-5 error.
ResNets were a contagious idea. Skip connections, or variants of them, now appear in almost every successful architecture: U-Nets for segmentation, DenseNets, and crucially Transformers, which wrap a residual connection around each attention and feed-forward sublayer.
Is learning better networks as easy as stacking more layers? An obstacle to answering this question was the notorious problem of vanishing/exploding gradients.
— He et al., 2015
The big idea: a tiny architectural tweak unlocked an order of magnitude of depth. The lesson, repeated throughout AI, is that the right small idea compounds enormously.
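To make the "tiny tweak" concrete, here is a minimal sketch of a residual block in plain Python, assuming a toy branch F(x) made of two small linear layers with a ReLU between them. The names (relu, linear, residual_branch, residual_block, zero_params) are illustrative and not from the paper, and the sketch leaves out details of the real architecture such as convolutions, batch normalization, and the ReLU that the original design applies after the addition. It shows only the shortcut itself: the block outputs F(x) + x instead of F(x).

```python
# Toy residual block on plain Python lists (no framework needed).
# All function names here are illustrative, not taken from the paper.

def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in v]

def linear(weights, bias, v):
    """Compute weights @ v + bias for a list-of-rows weight matrix."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def residual_branch(params, v):
    """F(x): two small linear layers with a ReLU in between."""
    (w1, b1), (w2, b2) = params
    return linear(w2, b2, relu(linear(w1, b1, v)))

def residual_block(params, v):
    """y = F(x) + x: the shortcut adds the input back unchanged."""
    fx = residual_branch(params, v)
    return [f + x for f, x in zip(fx, v)]

def zero_params(dim):
    """Parameters for which the branch F outputs all zeros."""
    zeros = [[0.0] * dim for _ in range(dim)]
    return ((zeros, [0.0] * dim), (zeros, [0.0] * dim))

# Stack 50 blocks whose branches output zero: the input passes through intact.
x = [1.0, -2.0, 3.0]
h = x
for _ in range(50):
    h = residual_block(zero_params(3), h)
print(h == x)  # True
```

When the branch outputs zero, each block reduces to the identity, so in this toy setup extra blocks can do nothing rather than do harm. That is one way to read why the shortcut makes very deep stacks easier to train: a block only has to learn a correction to its input, not a whole new representation.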
15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-history-resnets-creators
What is the core idea behind "ResNets and the Depth Breakthrough"?
Which term best describes a foundational idea in "ResNets and the Depth Breakthrough"?
A learner studying ResNets and the Depth Breakthrough would need to understand which concept?
Which of these is directly relevant to ResNets and the Depth Breakthrough?
Which of the following is a key point about ResNets and the Depth Breakthrough?
Which of these does NOT belong in a discussion of ResNets and the Depth Breakthrough?
What is the key insight about "The one-line idea" in the context of ResNets and the Depth Breakthrough?
Which statement accurately describes an aspect of ResNets and the Depth Breakthrough?
What does working with ResNets and the Depth Breakthrough typically involve?
Which of the following is true about ResNets and the Depth Breakthrough?
Which best describes the scope of "ResNets and the Depth Breakthrough"?
Which section heading best belongs in a lesson about ResNets and the Depth Breakthrough?
Which of the following is a concept covered in ResNets and the Depth Breakthrough?