Tendril

Lesson 261 of 2116

Uncertainty Quantification in LLMs

A model that says 'I am 95 percent sure' and is wrong 40 percent of the time is miscalibrated. Measuring that gap is uncertainty quantification.

Creators · AI Foundations · ~27 min read · Advanced · BI2 Representation & Reasoning · BI3 Learning · BI5 Societal Impact

How Sure Is the Model, Really?

LLMs produce a probability distribution over possible next tokens at every step. That distribution encodes how confident the model is. But a confident-sounding answer in English is not the same as the model's internal probability, and the gap between the two is where uncertainty quantification lives.
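To make the idea of a next-token distribution concrete, here is a minimal sketch in plain Python. The logits are made-up values for four hypothetical candidate tokens, not output from any real model; `softmax` and `entropy_bits` are illustrative helper names.

```python
import math

def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits; higher means a more spread-out distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical logits for four candidate next tokens (assumed values).
peaked = softmax([8.0, 1.0, 0.5, 0.0])  # one token dominates
flat = softmax([1.0, 1.0, 1.0, 1.0])    # all tokens equally likely

print(max(peaked), entropy_bits(peaked))
print(max(flat), entropy_bits(flat))
```

In the peaked case almost all probability sits on one token and the entropy is close to zero; in the flat case each token gets 0.25 and the entropy is 2 bits, the maximum for four options.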

Three kinds of uncertainty

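Aleatoric: irreducible noise inherent in the data (for example, different annotators would label the same item differently)
Epistemic: uncertainty from limited knowledge, because the model has not seen enough relevant data; more data can reduce it
Model: uncertainty from the choice of architecture or training procedure (often treated as part of epistemic uncertainty)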

Signals you can actually read

Compare the options

Signal	What it captures	How to read
Token log-probabilities	Sequence probability	Low average logprob = uncertain answer
Entropy of next-token distribution	How spread out predictions are	High entropy at choice points = branching
Semantic consistency across samples	Meaning-level uncertainty	Same answer across 5 samples = confident; divergent answers = uncertain
Verbalized confidence	Self-reported probability	Easy to obtain, but often miscalibrated
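Two of the signals in the table can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the log-probabilities and sampled answers are invented values, and `agreement_rate` compares normalized strings, which is only a crude stand-in for true meaning-level comparison.

```python
import math
from collections import Counter

def mean_logprob(token_logprobs):
    """Average per-token log-probability of a generated answer."""
    return sum(token_logprobs) / len(token_logprobs)

def agreement_rate(answers):
    """Fraction of sampled answers that match the most common answer.

    Assumption: answers are compared after trimming and lowercasing,
    a simple proxy for semantic equivalence.
    """
    normalized = [a.strip().lower() for a in answers]
    top_count = Counter(normalized).most_common(1)[0][1]
    return top_count / len(normalized)

# Hypothetical per-token log-probabilities for two answers.
confident_answer = [-0.05, -0.10, -0.02]
unsure_answer = [-1.2, -2.3, -0.9]

# exp(mean logprob) is the geometric-mean token probability.
print(math.exp(mean_logprob(confident_answer)))
print(math.exp(mean_logprob(unsure_answer)))

# Five hypothetical samples for the same prompt.
samples = ["Paris", "paris", "Paris ", "Lyon", "Paris"]
print(agreement_rate(samples))
```

Averaging the log-probabilities, rather than summing them, keeps long answers from looking less certain simply because they contain more tokens.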

Why it matters in practice

1. Let low-confidence answers trigger a tool call or human review
2. Abstain from answering when uncertainty is too high
3. Surface uncertainty in the UI so users can weigh it
4. Track calibration over time as a quality metric
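The first two items amount to a routing rule on a confidence score. The sketch below is one possible shape for it; the function name `route`, the action labels, and both threshold values are hypothetical and would need tuning against real data.

```python
def route(confidence, answer_threshold=0.8, review_threshold=0.5):
    """Decide what to do with an answer given a confidence score in [0, 1].

    Assumed policy (illustrative only):
      - at or above answer_threshold: return the answer directly
      - at or above review_threshold: escalate to a tool call or human review
      - otherwise: abstain
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= answer_threshold:
        return "answer"
    if confidence >= review_threshold:
        return "escalate"
    return "abstain"

for score in (0.9, 0.6, 0.2):
    print(score, route(score))
```

The confidence score could come from any of the signals above, such as the geometric-mean token probability or the agreement rate across samples.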

“A responsible model should not just give you an answer. It should tell you how much to trust it.”
A common refrain in AI safety literature

The big idea: confidence without calibration is noise. Quantifying uncertainty turns an LLM from a slot machine into a sensor.
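One common way to put a number on calibration is expected calibration error (ECE): group predictions into confidence bins, then take the weighted average gap between stated confidence and observed accuracy. The sketch below is a minimal version with equal-width bins; the bin count and the toy data (ten answers stated at 95 percent confidence, four of them wrong, echoing the example at the top of the lesson) are assumptions for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |mean confidence - accuracy| over confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Map confidence to a bin index; a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece

# Toy data: the model says 0.95 every time but is right only 6 times out of 10.
stated = [0.95] * 10
outcomes = [True] * 6 + [False] * 4
print(expected_calibration_error(stated, outcomes))
```

Here all ten predictions land in one bin with mean confidence 0.95 and accuracy 0.60, so the gap is about 0.35. A well-calibrated model would score close to zero.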
