Python async/await — Waiting Without Blocking
Async lets your program make 100 API calls at once instead of one at a time. Essential for LLM apps. You'll write the two patterns that solve 90% of cases.
Lesson map
What this lesson covers
Learning path
The main moves in order
1. Why async matters for AI code
2. async
3. await
4. asyncio
Section 1
Why async matters for AI code
Every call to Claude or GPT takes 1–10 seconds. If your app makes 50 calls sequentially, it takes minutes. With asyncio, you fire all 50 at once and wait only as long as the slowest one. This single pattern is the difference between a toy script and a production app.
asyncio.gather runs all coroutines concurrently. Three API calls, one total wait.
```python
import asyncio

import httpx

async def fetch(client: httpx.AsyncClient, url: str) -> dict:
    response = await client.get(url, timeout=10)
    response.raise_for_status()
    return response.json()

async def main() -> None:
    urls = [
        "https://api.github.com/users/anthropic",
        "https://api.github.com/users/openai",
        "https://api.github.com/users/vercel",
    ]
    async with httpx.AsyncClient() as client:
        results = await asyncio.gather(*(fetch(client, u) for u in urls))
        for data in results:
            print(data["login"], "-", data.get("bio", ""))

asyncio.run(main())
```
The two words: async and await
- async def creates a coroutine function — calling it returns a coroutine object, it does not run yet
- await pauses until a coroutine finishes, freeing the event loop to run others
- asyncio.run starts the event loop; asyncio.gather runs many coroutines at once
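The first two bullets are easy to verify for yourself. A minimal sketch (`greet` and its short sleep are illustrative stand-ins, not from the lesson's code): calling an `async def` function returns a coroutine object and runs nothing until you `await` it.

```python
import asyncio

async def greet(name: str) -> str:
    # Pausing here hands control back to the event loop.
    await asyncio.sleep(0.01)
    return f"hello, {name}"

async def main() -> None:
    coro = greet("world")       # nothing has executed yet
    print(type(coro).__name__)  # prints: coroutine
    result = await coro         # now the body actually runs
    print(result)               # prints: hello, world

asyncio.run(main())
```

Forgetting the `await` is the classic beginner trap: the call "succeeds" but the work never happens, and Python only emits a "coroutine was never awaited" warning.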
Concurrency with rate limiting (the real world)
Semaphore caps concurrent requests so you don't hit the provider's rate limit.
```python
import asyncio

from anthropic import AsyncAnthropic

client = AsyncAnthropic()
semaphore = asyncio.Semaphore(5)  # max 5 concurrent requests

async def summarize(text: str) -> str:
    async with semaphore:
        response = await client.messages.create(
            model="claude-opus-4-7",
            max_tokens=200,
            messages=[{"role": "user", "content": f"One sentence summary:\n{text}"}],
        )
        return response.content[0].text

async def main() -> None:
    articles = ["...text 1...", "...text 2...", "...text 20..."]
    summaries = await asyncio.gather(*(summarize(a) for a in articles))
    for s in summaries:
        print(s)

asyncio.run(main())
```
Error handling across many coroutines
return_exceptions=True collects failures instead of aborting the whole batch — crucial for LLM calls.
```python
results = await asyncio.gather(
    *(fetch(client, u) for u in urls),
    return_exceptions=True,  # don't fail fast
)
for url, result in zip(urls, results):
    if isinstance(result, Exception):
        print(f"Failed {url}: {result}")
    else:
        print(f"OK {url}")
```
Mini-exercise
1. Write an async function that calls the Anthropic API with a prompt
2. Call it concurrently on a list of 10 prompts using gather
3. Add a semaphore limiting to 3 concurrent calls
4. Measure total time with time.perf_counter: 10 calls in waves of 3 means 4 waves instead of 10, so expect a roughly 2.5x speedup over sequential
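You can check the shape of your solution without spending real API calls by swapping in a fake call that just sleeps. This is a sketch under that assumption: `fake_llm_call`, `run_all`, and `CALL_TIME` are made-up names, and the sleep is scaled down from seconds to milliseconds.

```python
import asyncio
import time

CALL_TIME = 0.05  # stand-in for a multi-second API call, scaled down

async def fake_llm_call(prompt: str) -> str:
    await asyncio.sleep(CALL_TIME)  # simulates waiting on the network
    return f"answer to {prompt!r}"

async def run_all(prompts: list[str], limit: int) -> list[str]:
    semaphore = asyncio.Semaphore(limit)  # cap concurrent "requests"

    async def one(prompt: str) -> str:
        async with semaphore:
            return await fake_llm_call(prompt)

    return await asyncio.gather(*(one(p) for p in prompts))

prompts = [f"question {i}" for i in range(10)]
start = time.perf_counter()
answers = asyncio.run(run_all(prompts, limit=3))
elapsed = time.perf_counter() - start
# 10 calls in waves of 3 -> 4 waves, so roughly 4 x CALL_TIME total
print(f"{len(answers)} answers in {elapsed:.2f}s")
```

Swap `fake_llm_call` for your real Anthropic call once the timing looks right; the semaphore and gather code stay identical.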
Compare the options
| Sync | Async |
|---|---|
| 10 API calls = 10 × 2s = 20s | 10 API calls ≈ 2s (all at once) |
| Easy to reason about | Requires thinking in coroutines |
| Good for: scripts, simple tools | Good for: servers, LLM apps, scrapers |
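The table's first row can be reproduced in miniature with asyncio.sleep standing in for an API round trip. In this sketch, `call`, `sequential`, and `concurrent` are hypothetical helper names, and 50 ms stands in for a ~2 s call:

```python
import asyncio
import time

async def call(i: int) -> int:
    await asyncio.sleep(0.05)  # stand-in for one API round trip
    return i

async def sequential(n: int) -> float:
    start = time.perf_counter()
    for i in range(n):
        await call(i)  # one at a time: the waits add up
    return time.perf_counter() - start

async def concurrent(n: int) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(call(i) for i in range(n)))  # all waits overlap
    return time.perf_counter() - start

seq = asyncio.run(sequential(10))
conc = asyncio.run(concurrent(10))
print(f"sequential: {seq:.2f}s   concurrent: {conc:.2f}s")
```

Sequential time grows linearly with the number of calls; concurrent time stays close to the duration of a single call.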
Key terms in this lesson
Big idea: most AI code spends 99% of its time waiting. Async is how you stop waiting in series and start waiting in parallel.
Related lessons
Keep going
Creators · 45 min
Python Async With AI
async/await lets one program wait on many things at once. Perfect for HTTP calls and LLM APIs. Let AI help you avoid the common traps.
Creators · 40 min
FastAPI Minimal
FastAPI is Python's modern web framework. Type hints become schema. Docs auto-generate. Ship an API in 20 lines.
Creators · 50 min
The Landscape: Copilot vs. Cursor vs. Windsurf vs. Claude Code
The AI coding tool market fragmented fast. Let's map the 2026 landscape honestly: who is for autocomplete, who is for agents, who wins on cost, and what the tradeoffs actually feel like.
