Agents fail in weird, quiet, expensive ways. Learn the six failure modes, the warning signs, and the simple habits that catch problems before they compound.
Even the best agents in 2026 (Claude Opus 4.7, Devin 2.0, ChatGPT Agents) fail on roughly 15–40% of multi-step tasks on benchmarks like SWE-bench Verified and GAIA. The good news is that they succeed most of the time. The bad news is that the failures are often silent.
| Failure | What it looks like | How to catch it |
|---|---|---|
| Loop (stuck) | Agent retries the same failing step forever. | Max-step cap; log repeated actions. |
| Drift | Agent slowly wanders from the original goal. | Restate the goal every N steps. |
| Hallucinated tool | Agent invents a tool call that doesn't exist. | Strict tool schema validation. |
| Phantom success | Agent reports 'done' but didn't actually do it. | Verify with an independent check. |
| Cascade | Early wrong step poisons every later step. | Checkpoint state; allow rollback. |
| Runaway cost | Agent burns tokens/API calls without progress. | Budget cap; alert on cost per task. |
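
Several of the checks in the table can live in one small guard object that the agent loop calls before each action. The sketch below is one possible shape, not a reference implementation: `RunGuard`, `GuardTripped`, `validate_call`, the `TOOLS` registry, and all default limits are hypothetical names and values chosen for illustration. It covers the step cap, repeated-action detection, tool validation, and the budget cap.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical registry: tool name -> required argument names.
TOOLS = {"send_email": {"to", "subject", "body"}, "read_file": {"path"}}


class GuardTripped(Exception):
    """Raised when the run should stop and be escalated to a human."""


def validate_call(tool: str, args: dict) -> None:
    """Reject calls to unknown tools or calls missing required arguments."""
    if tool not in TOOLS:
        raise GuardTripped(f"unknown tool: {tool}")
    missing = TOOLS[tool] - set(args)
    if missing:
        raise GuardTripped(f"{tool} missing args: {sorted(missing)}")


@dataclass
class RunGuard:
    max_steps: int = 20         # hard cap on steps (assumed value)
    max_repeats: int = 3        # identical action this many times means stuck
    max_cost_usd: float = 1.00  # per-task budget (assumed value)
    steps: int = 0
    cost_usd: float = 0.0
    seen: Counter = field(default_factory=Counter)

    def record(self, tool: str, args: dict, cost_usd: float) -> None:
        """Call once per step, before executing the action."""
        validate_call(tool, args)
        self.steps += 1
        self.cost_usd += cost_usd
        # Identical tool + identical arguments counts as the same action.
        key = (tool, repr(sorted(args.items())))
        self.seen[key] += 1
        if self.steps > self.max_steps:
            raise GuardTripped(f"step cap exceeded ({self.max_steps})")
        if self.seen[key] >= self.max_repeats:
            raise GuardTripped(f"repeated action: {tool} x{self.seen[key]}")
        if self.cost_usd > self.max_cost_usd:
            raise GuardTripped(f"budget exceeded: ${self.cost_usd:.2f}")
```

In an agent loop, you would call `guard.record(tool, args, cost)` before each tool call and treat `GuardTripped` as a signal to stop and ask a human, rather than something to retry. Drift and cascade need different tools (restating the goal, checkpointing state) and are not covered here.
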
An agent writes a report and says 'I've emailed it to your team.' But it didn't — the email tool errored and the agent hallucinated the success. You find out three days later when someone asks about the report. Phantom success is the most damaging failure because it silently rots your trust.
BAD: 'I have sent the email to the marketing team.' (no proof, no message ID, no verification)
GOOD: 'I sent the email. Tool returned: messageId="abc123", status="delivered", recipients=3. You can verify in /sent.'
Force agents to quote tool output, not paraphrase it.
The single best habit for working with agents: end every run by asking 'how do I know this actually happened?' If you can't answer, you didn't finish.
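
One way to enforce the "quote, don't paraphrase" habit is to build the final status line from the raw tool result in code, so the agent's own wording never decides whether the task succeeded. This is a minimal sketch: `report_send` is a hypothetical helper, and the `messageId` and `status` fields are assumed to match the example output above; a real email tool may return a different shape.

```python
import json
from typing import Optional


def report_send(tool_result: Optional[dict]) -> str:
    """Return a status line that quotes the raw tool output verbatim."""
    if not tool_result:
        return "NOT SENT: the email tool returned no result."
    raw = json.dumps(tool_result, sort_keys=True)
    # Success requires evidence: a message ID and a delivered status.
    has_id = bool(tool_result.get("messageId"))
    delivered = tool_result.get("status") == "delivered"
    if not (has_id and delivered):
        return f"NOT SENT: tool returned {raw}"
    return f"Sent. Tool returned {raw}"


print(report_send({"messageId": "abc123", "status": "delivered", "recipients": 3}))
print(report_send({"error": "timeout"}))
```

Anything that lacks the expected evidence is reported as a failure by default, which turns a silent phantom success into a visible error.
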
8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-agentic-why-agents-fail-builders
What is the main idea of "Why Agents Fail (and How to Notice)"?
Which concept is most central to "Why Agents Fail (and How to Notice)"?
Which limitation should you watch for in this topic?
What should a careful learner remember about "Never trust a paraphrased result"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about failure modes be treated?
Name one way to verify an AI answer about failure modes.
Which choice is a bad use of AI for this lesson?