Trying to make AI break its safety rules can get you in real trouble.
Some kids try to be 'sneaky' and trick AI into saying mean stuff or sharing things it shouldn't. This is called jailbreaking, and it can get you in trouble at school or with your parents.
If a friend wants you to help trick an AI, say 'no thanks' and tell a grown-up. Practice saying it!
Try this with a low-stakes example and a trusted adult nearby. The goal is to notice how AI talks about jailbreaking, not to let it make the decision for you.
8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-explorers-ethics-safety-AI-and-not-tricking-AI-on-purpose
What is the main idea of "Why Trying to Trick AI Into Doing Bad Stuff Is a Bad Idea"?
Which concept is most central to "Why Trying to Trick AI Into Doing Bad Stuff Is a Bad Idea"?
Which use of AI fits this topic best?
What should a careful learner remember about "The rule"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about jailbreaking be treated?
Name one way to verify an AI answer about jailbreaking.
Which action would help you apply "Why Trying to Trick AI Into Doing Bad Stuff Is a Bad Idea" responsibly?