14 min · Reviewed 2026
Ask For The Test Before The Fix
When a bug is real, the agent should prove it with a failing test before changing production code.
Name the job before naming the tool.
Write the smallest useful scope the agent can finish.
Run the result as a user, not as a fan of the tool.
Inspect the diff, data access, and failure path before sharing.
"Write a failing test for: free users can open paid lessons after refreshing. Do not edit app code until the test fails for the right reason."

Use this as the working prompt or checklist for the lesson.
What should the user be able to do when this is finished?
What data should the app or agent never expose?
What test proves the change works?
What rollback path exists if the output is wrong?
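One common shape for the last question's rollback path is a feature flag that routes back to the previous code path without a redeploy. A minimal sketch, assuming a hypothetical `FLAGS` dict and illustrative handler names:

```python
# Hypothetical feature-flag rollback sketch. FLAGS and the handler
# names are illustrative, not from the lesson.

FLAGS = {"new_lesson_access_check": True}

def check_access_new(user_plan, lesson_tier):
    # The new behavior being rolled out: block free users from paid tiers.
    return not (user_plan == "free" and lesson_tier == "paid")

def check_access_old(user_plan, lesson_tier):
    # The previous, known-good behavior, kept alive as the rollback path.
    return user_plan == "paid"

def check_access(user_plan, lesson_tier):
    # Flipping the flag off routes traffic back to the old code path
    # without a redeploy: that is the rollback path.
    if FLAGS["new_lesson_access_check"]:
        return check_access_new(user_plan, lesson_tier)
    return check_access_old(user_plan, lesson_tier)
```

The design choice: the old path stays in the codebase until the new one has run safely in production, so answering "what rollback path exists?" is as simple as naming the flag.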
End-of-lesson check
15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-coder-tests-first-agent-creators
What is the primary benefit of having an AI agent reproduce a bug with a failing test before writing a fix?
It automatically writes the fix for you
It reduces the number of tests needed
It makes the code run faster
It proves the bug exists and gives a clear target for the fix
In test-driven development, what does the 'red-green' cycle refer to?
Writing a failing test, then making it pass, then refactoring
Testing on separate staging and production environments
Writing code first, then tests to verify it
Using red and green color coding in the IDE to mark files
A developer wants an AI agent to fix a login bug. What should they ask the agent to do FIRST?
Check the production database for corruption
Write the code to fix the login issue
Deploy a temporary patch to production
Write a test that reproduces the login failure
Why is it important to 'run the result as a user, not as a fan of the tool'?
To save development time and costs
To verify the actual functionality works for end users
To impress stakeholders with the demo
To make sure the tool looks visually appealing
What does it mean to 'name the job before naming the tool'?
Define the problem and goal before selecting how to solve it
Give your project a catchy title
Pick a framework before writing any code
Choose a programming language before starting
An AI agent can create a working demo quickly. What makes that demo safe enough for another person to use?
It has a polished user interface
It is observable, reversible, and has tests
It uses the latest AI model available
It uses the most popular libraries
What does 'scope' mean in the context of an AI coding task?
The amount of memory the program uses
The specific, contained piece of work the agent should complete
The visual area of a web page
The number of files in the project
Why might an AI agent 'patch around the symptom' instead of fixing the root cause of a bug?
Because it didn't first reproduce the actual failure with a test
Because the code is too complex
Because the user didn't ask nicely
Because the model's training data is outdated
What is a 'rollback path'?
A function that reverses the order of a list
A navigation route to return to a previous page
A way to undo a change if something goes wrong
A backup of the database
What is the 'smallest useful scope' for an AI coding task?
Just the documentation and comments
A single, complete piece of functionality the agent can finish
The entire application feature set
A rough sketch of the idea without code
In continuous integration, what role do automated tests play?
They automatically verify code changes work correctly
They are optional and rarely used
They slow down the build process
They replace the need for code reviews
When deploying AI-generated code, what question should be asked about data?
How fast the data loads in milliseconds
What color theme the data uses
What data the app or agent should never expose
Which programming language the data prefers
What risk exists when letting an AI agent fix a bug without first seeing a failing test?
The tests will take too long to run
The agent might patch symptoms rather than fix the root cause
The agent will refuse to write code
The test will cause the program to crash
What makes code 'observable'?
It uses descriptive variable names
It has many comments explaining each line
It is open source and public
It can be monitored and its behavior tracked
When should an AI agent consider the 'rollback path'?
Never, that's not the agent's responsibility
Only at the very start of a new project
After the code has already been deployed
When fixing bugs or making changes that might need to be undone