Cyber Risk and Autonomous AI Attackers
AI agents can already find some software vulnerabilities and write exploits. What happens when those capabilities scale? A clear-eyed walk through the data.
Lesson map
What this lesson covers
Learning path
The main moves in order
1. The Capability Curve
2. autonomous agent
3. CTF
4. vulnerability research
Concept cluster
Terms to connect while reading
Section 1
The Capability Curve
Offensive cyber has been an AI-relevant domain for years. What is new — since roughly 2024 — is agentic capability: models that take multi-step actions, use tools, and pursue goals across hours of operation. This has moved AI from an assistant for human hackers to a plausible operator.
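The "multi-step actions, tools, goals" pattern can be sketched in a few lines. This is a minimal illustrative loop, not any lab's actual harness; `model`, `tools`, and the action schema are all hypothetical names chosen for the example.

```python
def run_agent(model, tools, goal, max_steps=10):
    """Drive a model through multi-step tool use toward a goal.

    `model` is a callable that reads the transcript so far and returns
    either {"type": "tool", "tool": name, "args": (...)} or
    {"type": "final", "answer": ...}. Purely a sketch of the pattern.
    """
    transcript = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        action = model(transcript)      # model decides the next step
        if action["type"] == "final":
            return action["answer"]
        tool = tools[action["tool"]]    # look up the requested tool
        result = tool(*action["args"])  # execute it in the environment
        transcript.append(f"{action['tool']}{action['args']} -> {result}")
    return None                         # step budget exhausted
```

The point of the pattern is that the model sees the result of each tool call before choosing the next one, which is what lets it pursue a goal across hours of operation rather than answering in one shot.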
What agents can do (as of ~2025-2026)
- Solve low-to-medium-difficulty Capture the Flag challenges autonomously (per published METR and UK AISI evaluations)
- Write working exploits for known, well-documented vulnerabilities
- Perform recon: enumerate assets, identify plausible vulnerabilities, generate credential lists
- Partially automate phishing campaigns with personalized content
- Find some novel vulnerabilities in real open-source code (Google Project Zero, Anthropic demos)
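To make the "generate credential lists" item above concrete, here is a deliberately trivial sketch of the kind of wordlist expansion involved. The function name and inputs are invented for illustration; real tooling conditions on far richer target-specific context, which is exactly where model assistance adds scale.

```python
from itertools import product

def candidate_credentials(org_terms, years, suffixes=("", "!", "123")):
    """Combine organization terms, years, and common suffixes into
    candidate passwords. Illustrative only: the point is that this kind
    of combinatorial generation is cheap to automate and personalize."""
    return [f"{term}{year}{suffix}"
            for term, year, suffix in product(org_terms, years, suffixes)]
```

For example, `candidate_credentials(["acme"], [2025])` yields `["acme2025", "acme2025!", "acme2025123"]`.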
Why defense is not symmetric
1. Attackers need one path; defenders must close all paths.
2. AI helps both sides, but asymmetric gains may favor attackers in specific domains (the scale of phishing, exploit development).
3. AI-assisted defense (automated patching, anomaly detection, log analysis) is also accelerating.
4. The net effect is not yet clear; 2024-2025 data suggests roughly offsetting gains, with high variance by sector.
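The "one path vs. all paths" point can be made quantitative with a toy model: if a defender independently closes each of n attack paths with probability p, the attacker succeeds whenever at least one path stays open, with probability 1 - p^n. The independence assumption is a simplification; real attack paths are correlated.

```python
def breach_probability(n_paths, p_defended):
    """P(at least one path open) when each of n paths is independently
    closed with probability p_defended. Toy model for intuition only."""
    return 1 - p_defended ** n_paths
```

Even at 99% per-path coverage, 100 independent paths leave roughly a 63% chance of breach, which is why small attacker-side gains in path discovery can matter more than equal-sized defender-side gains.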
What is being done
- Frontier labs run cyber-specific pre-deployment evaluations (OpenAI's Preparedness Framework, Anthropic's Responsible Scaling Policy)
- CISA, UK NCSC, and partners publish AI cyber guidance
- DARPA AIxCC (AI Cyber Challenge) develops defensive AI
- Bug bounty programs are being restructured for AI-driven findings
“We're in the strange position of hoping the offense-defense balance stays close, because any big asymmetry either way breaks a lot of what holds the internet together.”
The big idea: AI in cyber is not science fiction. It is a real, scaling capability with measured progress on both sides. The question for the next several years is whether defense keeps up — and what policy levers help it.