AI did not start in 2022. It has decades of wrong turns and breakthroughs behind it. Knowing the history helps you tell hype from real progress.
AI research began in the 1950s. The field has gone through booms and winters — periods of huge funding followed by collapse. Understanding this rhythm helps you calibrate today's excitement.
Early researchers believed intelligence was logic. They wrote programs that manipulated symbols according to formal rules. Expert systems, which reached their commercial peak in the 1980s, encoded human expert knowledge as if-then rules; MYCIN, built in the 1970s for medical diagnosis, is a well-known early example. They worked for narrow problems but crumbled outside them.
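To make the if-then idea concrete, here is a minimal sketch of a rule-based system. The rules and fact names are hypothetical and purely illustrative; they are not taken from MYCIN, which used a different inference strategy and attached certainty factors to its rules. The sketch uses simple forward chaining: it keeps firing rules until nothing new can be concluded.

```python
# Hypothetical toy rules: (set of required facts, conclusion).
# These are illustrative only and are not real medical guidance.
RULES = [
    ({"fever", "stiff_neck"}, "suspect_infection"),
    ({"suspect_infection", "positive_culture"}, "recommend_treatment"),
]


def infer(facts, rules=RULES):
    """Forward chaining: fire rules until no new conclusions appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires only when every one of its conditions is known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    # Return only what was derived, not the facts we started with.
    return known - set(facts)


if __name__ == "__main__":
    print(infer({"fever", "stiff_neck", "positive_culture"}))
    print(infer({"headache"}))  # no rule matches, so nothing is derived
```

The second call shows the brittleness described above: an input the rule authors did not anticipate produces no conclusion at all, because the system knows nothing beyond its hand-written rules.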
In the 1990s and 2000s, researchers pivoted to data-driven methods. Support vector machines, decision trees, and shallow neural networks dominated. IBM's Deep Blue beat Kasparov at chess in 1997, but it relied on hand-tuned search rather than learning from data, and it was not general intelligence.
In 2012, a neural network called AlexNet won the ImageNet competition by a huge margin, kicking off the deep learning revolution. GPUs, big datasets, and the decades-old backpropagation algorithm combined to finally make deep networks trainable. In 2016, AlphaGo beat Lee Sedol, one of the world's top Go players, a feat many experts had expected to be at least a decade away.
The 2017 paper "Attention Is All You Need" introduced the transformer architecture. It replaced the recurrent networks then used for language, which read a sequence one token at a time, with an attention-based structure that processes all tokens in parallel. Every modern LLM (GPT, Claude, Gemini, Llama) is a transformer at heart.
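The core operation of the transformer is scaled dot-product attention. The sketch below is a minimal pure-Python version for illustration, assuming small lists of numbers as inputs; the function names are my own, and real models use batched matrix operations on GPUs, multiple attention heads, and learned projections that are omitted here.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    # Each query is handled independently of the others, which is why
    # this computation parallelizes well, unlike a recurrent network.
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


if __name__ == "__main__":
    queries = [[1.0, 0.0]]
    keys = [[1.0, 1.0], [1.0, 1.0]]
    values = [[0.0, 2.0], [2.0, 4.0]]
    print(attention(queries, keys, values))
```

In this usage example both keys are identical, so the query weights them equally and the result is the plain average of the two value vectors.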
| Year | Milestone |
|---|---|
| 2017 | Transformer architecture published |
| 2018 | BERT and GPT-1 released |
| 2020 | GPT-3 shows few-shot learning |
| 2022 | ChatGPT makes AI mainstream |
| 2023-2024 | GPT-4, Claude 3, Gemini, open-weight Llama |
| 2025-2026 | Reasoning models, multimodality, agentic systems |
AI winters end not with a new theory, but with enough compute.
— A long-time researcher
The big idea: today's AI is the fourth major wave, built on GPUs, internet-scale data, and the transformer. Knowing the cycle helps you see that hype is not new, but neither is real progress.
8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-builders-history-of-ai
What is the main idea of "A Short History: From Expert Systems to Transformers"?
Which concept is most central to "A Short History: From Expert Systems to Transformers"?
Which use of AI fits this topic best?
What should a careful learner remember about "Why GPUs mattered"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about AI history be treated?
Name one way to verify an AI answer about AI history.
Which action would help you apply "A Short History: From Expert Systems to Transformers" responsibly?