Frontier Latency And Streaming Patterns
Frontier models can be slow. Streaming, partial rendering, and server-sent events turn 'feels broken' into 'feels fast'.
Lesson map
What this lesson covers
Learning path
The main moves in order
1. Two latencies that matter
2. Streaming patterns that work
3. Compare the options
4. Applied exercise
Concept cluster
Terms to connect while reading: latency · streaming · time to first token
Two latencies that matter
Frontier latency comes in two flavors: time to first token and total completion time. A reasoning model that takes 30 seconds in total but shows its first token within 2 seconds feels far better than a 15-second model that emits nothing for 14 of them. UX tracks perception, not the sum.
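To make the two numbers concrete, here is a minimal sketch of measuring them against a streaming HTTP endpoint. It assumes a POST API that returns a chunked response body; the URL, request shape, and the measureLatencies name are illustrative, not any particular vendor's API.

```typescript
// A sketch of measuring both latencies against a streaming endpoint.
// The URL and request body are placeholders, not a specific vendor API.
async function measureLatencies(
  url: string,
  body: unknown,
): Promise<{ timeToFirstToken: number | null; totalTime: number }> {
  const start = performance.now();
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });

  let timeToFirstToken: number | null = null;
  const reader = res.body!.getReader();
  // Read chunks as they arrive; the first non-empty chunk marks TTFT.
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (timeToFirstToken === null && value.length > 0) {
      timeToFirstToken = performance.now() - start;
    }
  }
  return { timeToFirstToken, totalTime: performance.now() - start };
}
```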
Streaming patterns that work
1. Stream tokens to the UI as soon as they arrive; never buffer (see the sketch after this list)
2. Show a 'thinking' indicator before the first token
3. Display reasoning traces if the user asks (some models expose them)
4. Render code blocks progressively, not at the end
5. For long completions, surface the running outline first
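A minimal browser-side sketch of patterns 1 and 2, assuming the server speaks server-sent events and emits one JSON object per token. The endpoint, the payload shape, and the custom 'done' event are all assumptions; note that EventSource only issues GET requests, so a real client would pass the prompt in the URL or a prior setup call.

```typescript
// A sketch of patterns 1 and 2: show a 'thinking' indicator, then
// append tokens as server-sent events arrive. The endpoint path and
// the {"token": "..."} payload shape are assumptions, not a real API.
function streamIntoElement(endpoint: string, output: HTMLElement): void {
  output.textContent = "Thinking…"; // indicator before the first token
  let sawFirstToken = false;

  const source = new EventSource(endpoint);
  source.onmessage = (event: MessageEvent<string>) => {
    if (!sawFirstToken) {
      output.textContent = ""; // clear the indicator on first token
      sawFirstToken = true;
    }
    const { token } = JSON.parse(event.data) as { token: string };
    output.textContent += token; // render immediately; never buffer
  };
  source.addEventListener("done", () => source.close()); // assumed end event
  source.onerror = () => source.close();
}
```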
Compare the options
| Pattern | Best for | Risk |
|---|---|---|
| Token-by-token streaming | Chat UIs | Layout shift if not styled |
| Block-by-block streaming | Document drafts | Less granular feedback |
| Status updates from agents | Long-running tasks | Spammy if too frequent |
| Buffered final response | Structured outputs | Feels broken |
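The agent-status row is the one that turns spammy, and the usual fix is throttling: coalesce bursts and emit at most one update per window. A minimal sketch, with the 500 ms window as an illustrative default rather than anything this lesson prescribes:

```typescript
// A sketch of throttling agent status updates so long-running tasks
// stay informative without becoming spammy. The 500 ms window is an
// illustrative choice.
function makeStatusReporter(
  render: (msg: string) => void,
  minIntervalMs = 500,
) {
  let lastEmit = 0;
  let pending: string | null = null;
  let timer: ReturnType<typeof setTimeout> | null = null;

  return (msg: string) => {
    const now = Date.now();
    if (now - lastEmit >= minIntervalMs) {
      lastEmit = now;
      render(msg);
    } else {
      // Coalesce bursts: keep only the latest message, flush it later.
      pending = msg;
      if (!timer) {
        timer = setTimeout(() => {
          timer = null;
          if (pending) {
            lastEmit = Date.now();
            render(pending);
            pending = null;
          }
        }, minIntervalMs - (now - lastEmit));
      }
    }
  };
}
```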
Applied exercise
1. Measure time to first token for your top three frontier endpoints (a sketch follows this list)
2. Give anything over 3 seconds a streaming or progressive UX
3. Add a 'thinking' indicator if the model takes a moment
4. Re-test perceived speed with a teammate, not against your own metric
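A sketch of steps 1 and 2 as a script, reusing the hypothetical measureLatencies helper from earlier; the endpoint URLs and request body are placeholders, and the 3-second threshold is the one from the exercise.

```typescript
// A sketch of steps 1 and 2, reusing measureLatencies from the earlier
// example. URLs and the request body are placeholders; assumes an
// ES-module context so top-level await is allowed.
const endpoints = [
  "https://api.example.com/v1/model-a",
  "https://api.example.com/v1/model-b",
  "https://api.example.com/v1/model-c",
];

for (const url of endpoints) {
  const { timeToFirstToken, totalTime } = await measureLatencies(url, {
    prompt: "ping",
  });
  const needsStreamingUX = (timeToFirstToken ?? Infinity) > 3_000; // 3-second rule
  console.log(url, { timeToFirstToken, totalTime, needsStreamingUX });
}
```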
Key terms in this lesson
latency · streaming · time to first token
The big idea: latency is what users feel, not what the stopwatch says. Stream early and the slow model feels fast.
Related lessons
Keep going
DeepSeek R1 Distills: Reasoning on Local Hardware
DeepSeek-style distills teach the trade-off between long reasoning traces, local speed, and answer quality.
Text Generation Inference: Production Serving Concepts
Hugging Face Text Generation Inference is a useful teaching example for production model serving: router, model server, streaming, and operational controls.
AI Vendor Region Selection: Latency, Compliance, Resilience
Where your AI runs matters for latency, data residency, and resilience. Region selection isn't trivial.
