Haiku is Anthropic's cheap, fast tier. Here is the math on when it beats Sonnet for production workloads.
Everyone talks about Opus and Sonnet. Haiku 4.5 is the quiet workhorse — approximately $1 in / $5 out per million tokens, sub-second first-token latency, and quality that now rivals what Sonnet 3.5 shipped 18 months ago. For high-volume apps, Haiku is where the margins live.
| Metric | Haiku 4.5 | Sonnet 4.6 |
|---|---|---|
| Input ($ / M tokens) | ~$1 | $3 |
| Output ($ / M tokens) | ~$5 | $15 |
| Typical p50 latency | <1s | 2-4s |
| Best for | routing, extraction, high QPS | reasoning, long docs, quality chat |
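The price gap compounds at volume. A quick sketch of the arithmetic, using the table's per-million-token prices; the traffic level and per-request token counts are illustrative assumptions, not benchmarks:

```python
# Monthly cost comparison using the table's per-million-token prices.
# Request volume and token counts are assumptions for the arithmetic.
HAIKU = {"in": 1.0, "out": 5.0}     # ~$ per million tokens
SONNET = {"in": 3.0, "out": 15.0}

def monthly_cost(prices, requests_per_day, in_tokens, out_tokens, days=30):
    """Dollar cost for a month of traffic at given per-request token counts."""
    per_request = (in_tokens * prices["in"] + out_tokens * prices["out"]) / 1_000_000
    return per_request * requests_per_day * days

# Example: 1M short classification calls/day, ~300 tokens in, ~20 tokens out.
haiku = monthly_cost(HAIKU, 1_000_000, 300, 20)
sonnet = monthly_cost(SONNET, 1_000_000, 300, 20)
print(f"Haiku: ${haiku:,.0f}/mo  Sonnet: ${sonnet:,.0f}/mo")
# → Haiku: $12,000/mo  Sonnet: $36,000/mo
```

At these assumed token counts, routing classification through Haiku is a straight 3x cost reduction before any quality trade-off enters the picture.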
```python
import anthropic

client = anthropic.Anthropic()

# `ticket` holds the incoming support-ticket text.
response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=200,
    messages=[{"role": "user", "content": f"Classify: {ticket}"}],
)
```

A routing call that costs a fraction of a cent.

**Quiz** · 15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-modelx-claude-haiku-45-builders
1. A developer is building a ticket routing system that must handle 10,000 requests per minute. Which model would be most cost-effective for the initial classification step?
2. What does the lesson mean when it says Haiku is "the quiet workhorse"?
3. A student asks what "latency" means in the context of AI models. Which definition is correct?
4. What does QPS stand for in the lesson's comparison table?
5. A developer implements a system where Haiku handles initial document parsing, and if confidence is low, it escalates to Sonnet. What is this architectural pattern called?
6. A product manager wants to reduce API costs by 80% while maintaining quality on complex queries. Which approach does the lesson recommend?
7. Which task is the lesson LEAST likely to recommend Haiku for?
8. A company processes 1 million customer messages per day. Why might Haiku help their bottom line more than Sonnet?
9. The lesson mentions "structured extraction from semi-clean docs." What type of document would be most suitable for Haiku?
10. What is the primary reason to use Haiku for "tool-call decisions in multi-step agents"?
11. Based on the lesson, if a developer's priority is minimum time-to-first-token, which model should they choose?
12. What does the lesson imply about using Haiku for "autocomplete-style suggestions"?
13. A startup is building their first AI product with a limited budget. Why might the lesson suggest starting with Haiku?
14. The lesson states Haiku can handle 80% of work while cutting costs. What happens to the remaining 20% of requests?
15. Why might a company choose NOT to use Haiku for long document summarization?
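The Haiku-first, escalate-to-Sonnet architecture referenced in the questions above can be sketched in a few lines. The model calls are stubbed so the example is self-contained; the 0.8 threshold, stub labels, and helper names are illustrative assumptions, and a real implementation would call the Messages API in their place:

```python
# Sketch: route through a cheap model first, escalate on low confidence.
# Model calls are stubbed; the threshold and labels are assumptions.
from typing import Callable, Tuple

def route_with_escalation(
    ticket: str,
    cheap: Callable[[str], Tuple[str, float]],   # returns (label, confidence)
    strong: Callable[[str], str],
    threshold: float = 0.8,
) -> str:
    """Accept the cheap model's label unless its confidence is below threshold."""
    label, confidence = cheap(ticket)
    if confidence >= threshold:
        return label            # the ~80% path: Haiku's answer stands
    return strong(ticket)       # the ~20% path: escalate to Sonnet

# Stubs standing in for Haiku and Sonnet calls.
haiku_stub = lambda t: ("billing", 0.95) if "invoice" in t else ("unknown", 0.3)
sonnet_stub = lambda t: "account-access"

print(route_with_escalation("invoice overdue", haiku_stub, sonnet_stub))  # → billing
print(route_with_escalation("can't log in", haiku_stub, sonnet_stub))     # → account-access
```

The design keeps the expensive model entirely out of the hot path: Sonnet's latency and price are only paid on the minority of requests the cheap model flags as uncertain.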