Python async/await — Waiting Without Blocking
Async lets your program make 100 API calls at once instead of one at a time. Essential for LLM apps. You'll write the two patterns that solve 90% of cases.
Lesson map
What this lesson covers
Learning path
The main moves in order
1. Why async matters for AI code
2. async
3. await
4. asyncio
Section 1
Why async matters for AI code
Every call to Claude or GPT takes 1–10 seconds. If your app makes 50 calls sequentially, it takes minutes. With asyncio, you fire all 50 at once and wait only as long as the slowest one. This single pattern is the difference between a toy script and a production app.
asyncio.gather runs all coroutines concurrently. Three API calls, one total wait.
```python
import asyncio

import httpx

async def fetch(client: httpx.AsyncClient, url: str) -> dict:
    response = await client.get(url, timeout=10)
    response.raise_for_status()
    return response.json()

async def main() -> None:
    urls = [
        "https://api.github.com/users/anthropic",
        "https://api.github.com/users/openai",
        "https://api.github.com/users/vercel",
    ]
    async with httpx.AsyncClient() as client:
        results = await asyncio.gather(*(fetch(client, u) for u in urls))
        for data in results:
            print(data["login"], "-", data.get("bio", ""))

asyncio.run(main())
```
The two words: async and await
- async def creates a coroutine function — calling it returns a coroutine object, it does not run yet
- await pauses until a coroutine finishes, freeing the event loop to run others
- asyncio.run starts the event loop; asyncio.gather runs many coroutines at once
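The first two bullets are easy to verify for yourself. A minimal sketch (`greet` and its short sleep are illustrative stand-ins, not from the lesson's code): calling an `async def` function returns a coroutine object and runs nothing until you `await` it.

```python
import asyncio

async def greet(name: str) -> str:
    # Pausing here hands control back to the event loop.
    await asyncio.sleep(0.01)
    return f"hello, {name}"

async def main() -> None:
    coro = greet("world")       # nothing has executed yet
    print(type(coro).__name__)  # prints: coroutine
    result = await coro         # now the body actually runs
    print(result)               # prints: hello, world

asyncio.run(main())
```

Forgetting the `await` is the classic beginner trap: the call "succeeds" but the work never happens, and Python only emits a "coroutine was never awaited" warning.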
Concurrency with rate limiting (the real world)
Semaphore caps concurrent requests so you don't hit the provider's rate limit.
```python
import asyncio

from anthropic import AsyncAnthropic

client = AsyncAnthropic()
semaphore = asyncio.Semaphore(5)  # max 5 concurrent requests

async def summarize(text: str) -> str:
    async with semaphore:
        response = await client.messages.create(
            model="claude-opus-4-7",
            max_tokens=200,
            messages=[{"role": "user", "content": f"One sentence summary:\n{text}"}],
        )
        return response.content[0].text

async def main() -> None:
    articles = ["...text 1...", "...text 2...", "...text 20..."]
    summaries = await asyncio.gather(*(summarize(a) for a in articles))
    for s in summaries:
        print(s)

asyncio.run(main())
```
Error handling across many coroutines
return_exceptions=True collects failures instead of aborting the whole batch — crucial for LLM calls.
```python
results = await asyncio.gather(
    *(fetch(client, u) for u in urls),
    return_exceptions=True,  # don't fail fast
)
for url, result in zip(urls, results):
    if isinstance(result, Exception):
        print(f"Failed {url}: {result}")
    else:
        print(f"OK {url}")
```
Mini-exercise
1. Write an async function that calls the Anthropic API with a prompt
2. Call it concurrently on a list of 10 prompts using gather
3. Add a semaphore limiting to 3 concurrent calls
4. Measure total time with time.perf_counter: 10 calls in waves of 3 means 4 waves instead of 10, so expect a roughly 2.5x speedup over sequential
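You can check the shape of your solution without spending real API calls by swapping in a fake call that just sleeps. This is a sketch under that assumption: `fake_llm_call`, `run_all`, and `CALL_TIME` are made-up names, and the sleep is scaled down from seconds to milliseconds.

```python
import asyncio
import time

CALL_TIME = 0.05  # stand-in for a multi-second API call, scaled down

async def fake_llm_call(prompt: str) -> str:
    await asyncio.sleep(CALL_TIME)  # simulates waiting on the network
    return f"answer to {prompt!r}"

async def run_all(prompts: list[str], limit: int) -> list[str]:
    semaphore = asyncio.Semaphore(limit)  # cap concurrent "requests"

    async def one(prompt: str) -> str:
        async with semaphore:
            return await fake_llm_call(prompt)

    return await asyncio.gather(*(one(p) for p in prompts))

prompts = [f"question {i}" for i in range(10)]
start = time.perf_counter()
answers = asyncio.run(run_all(prompts, limit=3))
elapsed = time.perf_counter() - start
# 10 calls in waves of 3 -> 4 waves, so roughly 4 x CALL_TIME total
print(f"{len(answers)} answers in {elapsed:.2f}s")
```

Swap `fake_llm_call` for your real Anthropic call once the timing looks right; the semaphore and gather code stay identical.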
Compare the options
| Sync | Async |
|---|---|
| 10 API calls = 10 × 2s = 20s | 10 API calls ≈ 2s (all at once) |
| Easy to reason about | Requires thinking in coroutines |
| Good for: scripts, simple tools | Good for: servers, LLM apps, scrapers |
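The table's first row can be reproduced in miniature with asyncio.sleep standing in for an API round trip. In this sketch, `call`, `sequential`, and `concurrent` are hypothetical helper names, and 50 ms stands in for a ~2 s call:

```python
import asyncio
import time

async def call(i: int) -> int:
    await asyncio.sleep(0.05)  # stand-in for one API round trip
    return i

async def sequential(n: int) -> float:
    start = time.perf_counter()
    for i in range(n):
        await call(i)  # one at a time: the waits add up
    return time.perf_counter() - start

async def concurrent(n: int) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(call(i) for i in range(n)))  # all waits overlap
    return time.perf_counter() - start

seq = asyncio.run(sequential(10))
conc = asyncio.run(concurrent(10))
print(f"sequential: {seq:.2f}s   concurrent: {conc:.2f}s")
```

Sequential time grows linearly with the number of calls; concurrent time stays close to the duration of a single call.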
Key terms in this lesson
Big idea: most AI code spends 99% of its time waiting. Async is how you stop waiting in series and start waiting in parallel.
Related lessons
Keep going
Creators · 45 min
Python Async With AI
async/await lets one program wait on many things at once. Perfect for HTTP calls and LLM APIs. Let AI help you avoid the common traps.
Creators · 40 min
FastAPI Minimal
FastAPI is Python's modern web framework. Type hints become schema. Docs auto-generate. Ship an API in 20 lines.
Creators · 50 min
The Landscape: Copilot vs. Cursor vs. Windsurf vs. Claude Code
The AI coding tool market fragmented fast. Let's map the 2026 landscape honestly: who is for autocomplete, who is for agents, who wins on cost, and what the tradeoffs actually feel like.
