How to Tell If Your Agent Run Was Actually Good
Score your agent on outcome, not on how clever the trace looked.
Builders · Agentic AI · ~4 min read
The big idea
A pretty trace that fails the task is still a failure.
Some examples
- Did the test suite end green?
- Was the PR mergeable?
- How many human nudges did it need?
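The three questions above can be turned into a small scoring sketch. This is only an illustration: the `RunOutcome` record, the `grade_run` function, and the `max_nudges` threshold are hypothetical names and values chosen for this example, not part of any real tool.

```python
from dataclasses import dataclass


@dataclass
class RunOutcome:
    """Outcome facts about one agent run (hypothetical record for illustration)."""
    tests_green: bool   # Did the test suite end green?
    pr_mergeable: bool  # Was the PR mergeable?
    human_nudges: int   # How many times did a human have to step in?


def grade_run(outcome: RunOutcome, max_nudges: int = 2) -> str:
    """Grade a run as "pass", "needs-help", or "fail" from outcomes only.

    The trace is deliberately not an input: a clever-looking trace
    cannot rescue a run that failed the task. The max_nudges default
    is an arbitrary assumption; pick a threshold that fits your work.
    """
    if not (outcome.tests_green and outcome.pr_mergeable):
        return "fail"
    if outcome.human_nudges > max_nudges:
        return "needs-help"
    return "pass"


if __name__ == "__main__":
    # Made-up runs, only to show how the grading behaves.
    runs = {
        "clean run": RunOutcome(tests_green=True, pr_mergeable=True, human_nudges=0),
        "pretty trace, red tests": RunOutcome(tests_green=False, pr_mergeable=True, human_nudges=0),
        "worked, but hand-held": RunOutcome(tests_green=True, pr_mergeable=True, human_nudges=5),
    }
    for name, run in runs.items():
        print(f"{name}: {grade_run(run)}")
```

Keeping the grade a function of outcomes alone is the design choice here: how the trace looked can help you debug a failure, but it never changes the score.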
Try it!
Open your favorite AI tool and score one recent agent run against one of the examples above. Pick the one that matches what you are actually working on this week. Spend 10 minutes, no more. Notice what worked and what did not; that's the real lesson.
Practice this safely
Try this with a school, hobby, or family example where the stakes are low. Use the AI output as a draft you can question, not as the final answer.
1. Ask AI to explain "outcome metric" in plain language, then underline anything that sounds uncertain or too broad.
2. Give it one detail from "How to Tell If Your Agent Run Was Actually Good" and ask for two possible next steps plus one reason each step might be wrong.
3. Check its explanation of run quality against a trusted source, teacher, adult, expert, or original document before you use it.
