Tendril

Lesson 1001 of 1169

Humans Gave AI Thumbs Up to Train It

AI chat tools got better partly because people rated their answers with a thumbs up or a thumbs down.

Explorers · AI Foundations · ~4 min read

The big idea

People rated AI answers with a thumbs up or a thumbs down. The AI was then trained to give more answers like the ones people liked. This training method is called RLHF.

Some examples

RLHF stands for Reinforcement Learning from Human Feedback.
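Paid workers read many AI answers and rated them, or picked the better of two answers.
The AI was then trained to give more answers like the highly rated ones and fewer like the poorly rated ones.
That is one reason modern AI chat tools often sound more helpful and clearer than older ones.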
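
The idea of ratings nudging an AI can be sketched in a few lines of Python. This is only a toy picture, not how real RLHF is built: real systems train a very large neural network and usually a separate "reward model" that learns from the ratings. The names `apply_feedback`, `favorite_style`, and the three answer styles are made up for this example.

```python
# Toy illustration only. Real RLHF updates a large neural network;
# here we just keep one score per (made-up) answer style.

def apply_feedback(scores, style, thumbs_up, step=1.0):
    """Return new scores after one thumbs up (True) or thumbs down (False)."""
    updated = dict(scores)  # copy so the old scores are not changed
    updated[style] += step if thumbs_up else -step
    return updated

def favorite_style(scores):
    """Pick the answer style with the highest score."""
    return max(scores, key=scores.get)

# Every style starts with the same score.
scores = {"clear and kind": 0.0, "confusing": 0.0, "rude": 0.0}

# Pretend ratings from human workers: (answer style, thumbs up?)
ratings = [
    ("clear and kind", True),
    ("confusing", False),
    ("rude", False),
    ("clear and kind", True),
]

for style, thumbs_up in ratings:
    scores = apply_feedback(scores, style, thumbs_up)

print(favorite_style(scores))  # clear and kind
```

After four ratings, the "clear and kind" style has the highest score, so the toy AI would choose it more often. Real training is far more complicated, but the direction of the nudge is the same.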

Try it!

When AI gives a great answer, click thumbs up. That feedback can help train future AI versions.

Practice this safely

Try this with a low-stakes example and a trusted adult nearby. The goal is to notice how AI talks about RLHF, not to let it make the decision for you.

1. Ask AI to explain RLHF in plain language, then underline anything that sounds uncertain or too broad.
2. Give it one detail from "Humans Gave AI Thumbs Up to Train It" and ask for two possible next steps plus one reason each step might be wrong.
3. Check what the AI says about human feedback against a trusted source, teacher, adult, expert, or original document before you use it.
