AI in the wild — real examples from real practitioners

Tendril

Here’s what’s actually happening on X.

Real posts · verified first

Real posts from people who actually shipped this. Verified entries first, with attribution.

0 confirmed links3 cited from news112 representative

Showing 1 of 115 postsEvery entry links back to X

Jan Leike@janleike

Alignment researcher; Anthropic

Representative

Scalable oversight in practice1.1 years agoCritic-model oversight

Representative of Jan's threads on how alignment teams use weaker models to critique the outputs of stronger models — catching errors a human reviewer would miss because of volume.

“If you can't oversee it, you can't align it. Oversight scales or it doesn't.”

How to replicate

1.Pick a task where the strong model's outputs are too numerous to human-review.
2.Define a rubric of failure modes (factual, unsafe, off-spec).

Prompt template

You are a critic model. Read the output below produced by a stronger model for this task: <task>. Score it against this rubric: <list of failure modes>. For each mode, return pass/fail and quote the exact span that triggered the fail. Do not judge style — only the listed modes. If in doubt, mark fail.

Pitfall

Treating the critic as the ground truth. The critic is a filter for human attention, not a replacement for it — measure the critic against humans, not the other way around.

What you'll learn

•Why oversight is the rate-limiter on deploying strong models
•How a weaker critic can provide useful signal on a stronger model
•How to measure critic quality instead of assuming it
•Where scalable oversight research is still open

Glossaryrlhf hallucination system prompt

#alignment#oversight#evals

About this page(tap to expand)

Curated examples of practitioners using AI in the wild — builders, sellers, ops leads, teachers, artists. Every entry links to the original post. When we can’t verify a specific URL, we flag the entry as representative rather than inventing one.

Tendril keeps this list conservative on purpose. Entries marked representativedescribe patterns we’ve seen practitioners discuss publicly on X but whose exact URLs we haven’t individually verified. We’d rather flag an example as representative than fabricate a precise post-ID. If you see a wrong attribution or a broken link, tell us and we’ll fix it within the day.