Tendril — AI Lessons for Real Life


Keeping Agents Safe

Agents that act in the real world need safety measures: spending limits, approval gates, and audit logs.

Without safety measures, agents will eventually do something unintended. With them, the impact of a mistake is bounded.

Three rules for any agent in your life

Set hard spending limits before giving access to money

Require approval for irreversible actions

Review the trace logs (also called audit logs) weekly
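To make the three rules concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular agent product works: the GuardedAgent class, its method names, and the approve callback are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class GuardedAgent:
    """Hypothetical wrapper that applies the three rules to an agent's actions."""
    spending_limit: float                      # hard cap on total spend
    spent: float = 0.0
    audit_log: list = field(default_factory=list)

    def _record(self, action: str, outcome: str) -> None:
        # Every attempt is logged, including blocked ones, so it can be reviewed later.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "outcome": outcome,
        })

    def spend(self, amount: float, what: str) -> bool:
        action = f"spend {amount:.2f} on {what}"
        if amount <= 0:
            self._record(action, "blocked: invalid amount")
            return False
        # Rule 1: refuse any purchase that would push the total past the hard limit.
        if self.spent + amount > self.spending_limit:
            self._record(action, "blocked: over limit")
            return False
        self.spent += amount
        self._record(action, "done")
        return True

    def run(self, action: str, irreversible: bool, approve=None) -> bool:
        # Rule 2: an irreversible action needs an explicit yes from a human.
        # `approve` is an assumed callback that returns True or False.
        if irreversible and not (approve is not None and approve(action)):
            self._record(action, "blocked: not approved")
            return False
        self._record(action, "done")
        return True

    def blocked_entries(self) -> list:
        # Rule 3: a starting point for the weekly review is everything that was blocked.
        return [e for e in self.audit_log if e["outcome"].startswith("blocked")]


agent = GuardedAgent(spending_limit=50.0)
agent.spend(30.0, "groceries")                 # allowed: total is 30
agent.spend(30.0, "concert ticket")            # blocked: total would be 60
agent.run("delete all cloud files", irreversible=True,
          approve=lambda action: False)        # blocked: the human said no

for entry in agent.audit_log:
    print(entry["action"], "->", entry["outcome"])
```

In this sketch the limit is checked before money moves, the approval callback is consulted before an irreversible action runs, and blocked attempts are logged alongside successful ones, so the weekly review shows what the agent tried to do as well as what it did.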

The big idea: Agent safety is about bounded surprises. Safety measures keep mistakes small.

End-of-lesson check

15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-builders-agentic-agent-safety

An approval gate in agent safety is best described as:

A memory system that helps the agent remember preferences
A requirement that a human must approve certain actions before the agent proceeds
A wall that prevents the agent from accessing the internet
A timer that slows down the agent

What is the purpose of an audit log?

To record everything the agent does so it can be reviewed later
To automatically delete files the agent doesn't need
To speed up the agent's calculations
To translate the agent's language into human words

What does it mean that agent safety is about 'bounded surprises'?

The agent should always do exactly what you expect
Surprises should happen at specific times only
The agent should surprise you with new features
Mistakes the agent makes should have limited impact

Which action would most need an approval gate?

Changing a screen brightness setting
Sending a daily weather update email
Deleting all files in a cloud storage account
Adding an event to a calendar

Why should spending limits be set before giving an agent financial access?

So the agent remembers your shopping preferences
So the agent cannot spend beyond a set maximum amount
So the agent works faster when spending money
So the agent can get discounts on purchases

What is 'sandboxing' in the context of agent safety?

Letting the agent play in a virtual game world
Running the agent in an isolated environment to limit potential damage
Creating a safe area for the agent to store files
Training the agent on sandboxed data sets

What is the benefit of reviewing audit logs weekly?

You can teach the agent new skills
The agent will run faster
The logs will delete themselves automatically
You can spot problems or unusual behavior early

Which of these is an example of a 'guardrail' for an agent?

A microphone that lets the agent hear sounds
A faster processor that makes the agent run quicker
A colorful interface that makes the agent look nice
A spending limit that stops the agent from overspending

Why is it risky to give an agent access to money without any safety measures?

The agent will eventually do something you didn't intend
The agent will become too smart
The agent will use less electricity
The agent will make better decisions

What makes an action 'irreversible' in the context of agent safety?

It cannot be undone once it is completed
It requires a password to start
It can only be done during business hours
It takes more than one hour to complete

A trace log is another name for what safety measure?

An approval gate that checks permissions
An audit log that records agent actions
A sandbox that isolates the agent
A spending limit that tracks purchases

What is the safest way to first give an agent access to a bank account?

Set a low daily spending limit and require approval for large amounts
Give full access so the agent can work efficiently
Allow access only on weekends
Deny any access to money

What happens when safety measures are NOT put on an agent?

The agent may eventually cause unintended problems
The agent will work perfectly all the time
The agent will need less electricity
The agent will learn faster

Which safety measure records what an agent did after the fact?

Audit log
Sandbox
Approval gate
Spending limit

What type of actions should require approval gates?

Actions that cannot be undone
Actions that use a lot of memory
Actions that are very simple
Actions that happen very quickly
