Python async/await — Waiting Without Blocking

Async lets your program make 100 API calls at once instead of one at a time. Essential for LLM apps. You'll write the two patterns that solve 90% of cases.

60 min · Reviewed 2026

Why async matters for AI code

Every call to Claude or GPT takes 1–10 seconds. If your app makes 50 calls sequentially, it takes minutes. With asyncio, you fire all 50 at once and wait only as long as the slowest one. This single pattern is the difference between a toy script and a production app.

import asyncio
import httpx

async def fetch(client: httpx.AsyncClient, url: str) -> dict:
    response = await client.get(url, timeout=10)
    response.raise_for_status()
    return response.json()

async def main() -> None:
    urls = [
        "https://api.github.com/users/anthropic",
        "https://api.github.com/users/openai",
        "https://api.github.com/users/vercel",
    ]
    async with httpx.AsyncClient() as client:
        results = await asyncio.gather(*(fetch(client, u) for u in urls))
    for data in results:
        print(data["login"], "-", data.get("bio", ""))

asyncio.run(main())asyncio.gather runs all coroutines concurrently. Three API calls, one total wait.

The two words: async and await

async def creates a coroutine function — calling it returns a coroutine object, it does not run yet
await pauses until a coroutine finishes, freeing the event loop to run others
asyncio.run starts the event loop; asyncio.gather runs many coroutines at once

Concurrency with rate limiting (the real world)

import asyncio
from anthropic import AsyncAnthropic

client = AsyncAnthropic()
semaphore = asyncio.Semaphore(5)   # max 5 concurrent requests

async def summarize(text: str) -> str:
    async with semaphore:
        response = await client.messages.create(
            model="claude-opus-4-7",
            max_tokens=200,
            messages=[{"role": "user", "content": f"One sentence summary:\n{text}"}],
        )
        return response.content[0].text

async def main():
    articles = ["...text 1...", "...text 2...", "...text 20..."]
    summaries = await asyncio.gather(*(summarize(a) for a in articles))
    for s in summaries:
        print(s)

asyncio.run(main())Semaphore caps concurrent requests so you don't hit the provider's rate limit.

Error handling across many coroutines

results = await asyncio.gather(
    *(fetch(client, u) for u in urls),
    return_exceptions=True,   # don't fail-fast
)
for url, result in zip(urls, results):
    if isinstance(result, Exception):
        print(f"Failed {url}: {result}")
    else:
        print(f"OK {url}")return_exceptions=True collects failures instead of aborting the whole batch — crucial for LLM calls.

Mini-exercise

Write an async function that calls the Anthropic API with a prompt
Call it concurrently on a list of 10 prompts using gather
Add a semaphore limiting to 3 concurrent calls
Measure total time with time.perf_counter — should be ~3x faster than sequential

Sync	Async
10 API calls = 10 × 2s = 20s	10 API calls ≈ 2s (all at once)
Easy to reason about	Requires thinking in coroutines
Good for: scripts, simple tools	Good for: servers, LLM apps, scrapers

Big idea: most AI code spends 99% of its time waiting. Async is how you stop waiting in series and start waiting in parallel.

End-of-lesson check

15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-prog-python-async-creators

What is the core idea behind "Python async/await — Waiting Without Blocking"?
1. Async lets your program make 100 API calls at once instead of one at a time. Essential for LLM apps. You'll write the two patterns that solve 90% of cases.
2. idempotency
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
Which term best describes a foundational idea in "Python async/await — Waiting Without Blocking"?
1. await
2. async
3. coroutine
4. event loop
A learner studying Python async/await — Waiting Without Blocking would need to understand which concept?
1. async
2. coroutine
3. await
4. event loop
Which of these is directly relevant to Python async/await — Waiting Without Blocking?
1. async
2. await
3. event loop
4. coroutine
Which of the following is a key point about Python async/await — Waiting Without Blocking?
1. async def creates a coroutine function — calling it returns a coroutine object, it does not run yet
2. await pauses until a coroutine finishes, freeing the event loop to run others
3. asyncio.run starts the event loop; asyncio.gather runs many coroutines at once
4. idempotency
What is one important takeaway from studying Python async/await — Waiting Without Blocking?
1. Call it concurrently on a list of 10 prompts using gather
2. Write an async function that calls the Anthropic API with a prompt
3. Add a semaphore limiting to 3 concurrent calls
4. Measure total time with time.perf_counter — should be ~3x faster than sequential
Which of these does NOT belong in a discussion of Python async/await — Waiting Without Blocking?
1. Write an async function that calls the Anthropic API with a prompt
2. Add a semaphore limiting to 3 concurrent calls
3. Call it concurrently on a list of 10 prompts using gather
4. idempotency
What is the key insight about "The sync/async trap" in the context of Python async/await — Waiting Without Blocking?
1. idempotency
2. Have your AI assistant draft both — then check if it handled ties and missing ke…
3. agent loop
4. You cannot call an async function from sync code without asyncio.run.
What is the recommended tip about "Always review AI output" in the context of Python async/await — Waiting Without Blocking?
1. AI-generated code can hallucinate APIs, miss edge cases, or introduce subtle bugs.
2. idempotency
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
Which statement accurately describes an aspect of Python async/await — Waiting Without Blocking?
1. idempotency
2. Every call to Claude or GPT takes 1–10 seconds. If your app makes 50 calls sequentially, it takes minutes.
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
What does working with Python async/await — Waiting Without Blocking typically involve?
1. idempotency
2. Have your AI assistant draft both — then check if it handled ties and missing ke…
3. Big idea: most AI code spends 99% of its time waiting. Async is how you stop waiting in series and start waiting in parallel.
4. agent loop
Which best describes the scope of "Python async/await — Waiting Without Blocking"?
1. It is unrelated to ai-coding workflows
2. It applies only to the opposite beginner tier
3. It was deprecated in 2024 and no longer relevant
4. It focuses on Async lets your program make 100 API calls at once instead of one at a time. Essential for LLM apps.
Which section heading best belongs in a lesson about Python async/await — Waiting Without Blocking?
1. The two words: async and await
2. idempotency
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
Which section heading best belongs in a lesson about Python async/await — Waiting Without Blocking?
1. idempotency
2. Concurrency with rate limiting (the real world)
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
Which section heading best belongs in a lesson about Python async/await — Waiting Without Blocking?
1. idempotency
2. Have your AI assistant draft both — then check if it handled ties and missing ke…
3. Error handling across many coroutines
4. agent loop

← Back to interactive lesson

Tendril · Creators · AI-Assisted Coding

Python async/await — Waiting Without Blocking

Async lets your program make 100 API calls at once instead of one at a time. Essential for LLM apps. You'll write the two patterns that solve 90% of cases.

60 min · Reviewed 2026

Why async matters for AI code

import asyncio
import httpx

async def fetch(client: httpx.AsyncClient, url: str) -> dict:
    response = await client.get(url, timeout=10)
    response.raise_for_status()
    return response.json()

async def main() -> None:
    urls = [
        "https://api.github.com/users/anthropic",
        "https://api.github.com/users/openai",
        "https://api.github.com/users/vercel",
    ]
    async with httpx.AsyncClient() as client:
        results = await asyncio.gather(*(fetch(client, u) for u in urls))
    for data in results:
        print(data["login"], "-", data.get("bio", ""))

asyncio.run(main())asyncio.gather runs all coroutines concurrently. Three API calls, one total wait.

The two words: async and await

async def creates a coroutine function — calling it returns a coroutine object, it does not run yet
await pauses until a coroutine finishes, freeing the event loop to run others
asyncio.run starts the event loop; asyncio.gather runs many coroutines at once

Concurrency with rate limiting (the real world)

import asyncio
from anthropic import AsyncAnthropic

client = AsyncAnthropic()
semaphore = asyncio.Semaphore(5)   # max 5 concurrent requests

async def summarize(text: str) -> str:
    async with semaphore:
        response = await client.messages.create(
            model="claude-opus-4-7",
            max_tokens=200,
            messages=[{"role": "user", "content": f"One sentence summary:\n{text}"}],
        )
        return response.content[0].text

async def main():
    articles = ["...text 1...", "...text 2...", "...text 20..."]
    summaries = await asyncio.gather(*(summarize(a) for a in articles))
    for s in summaries:
        print(s)

asyncio.run(main())Semaphore caps concurrent requests so you don't hit the provider's rate limit.

Error handling across many coroutines

results = await asyncio.gather(
    *(fetch(client, u) for u in urls),
    return_exceptions=True,   # don't fail-fast
)
for url, result in zip(urls, results):
    if isinstance(result, Exception):
        print(f"Failed {url}: {result}")
    else:
        print(f"OK {url}")return_exceptions=True collects failures instead of aborting the whole batch — crucial for LLM calls.

Mini-exercise

Write an async function that calls the Anthropic API with a prompt
Call it concurrently on a list of 10 prompts using gather
Add a semaphore limiting to 3 concurrent calls
Measure total time with time.perf_counter — should be ~3x faster than sequential

Sync	Async
10 API calls = 10 × 2s = 20s	10 API calls ≈ 2s (all at once)
Easy to reason about	Requires thinking in coroutines
Good for: scripts, simple tools	Good for: servers, LLM apps, scrapers

Big idea: most AI code spends 99% of its time waiting. Async is how you stop waiting in series and start waiting in parallel.

End-of-lesson check

15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-prog-python-async-creators

What is the core idea behind "Python async/await — Waiting Without Blocking"?
1. Async lets your program make 100 API calls at once instead of one at a time. Essential for LLM apps. You'll write the two patterns that solve 90% of cases.
2. idempotency
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
Which term best describes a foundational idea in "Python async/await — Waiting Without Blocking"?
1. await
2. async
3. coroutine
4. event loop
A learner studying Python async/await — Waiting Without Blocking would need to understand which concept?
1. async
2. coroutine
3. await
4. event loop
Which of these is directly relevant to Python async/await — Waiting Without Blocking?
1. async
2. await
3. event loop
4. coroutine
Which of the following is a key point about Python async/await — Waiting Without Blocking?
1. async def creates a coroutine function — calling it returns a coroutine object, it does not run yet
2. await pauses until a coroutine finishes, freeing the event loop to run others
3. asyncio.run starts the event loop; asyncio.gather runs many coroutines at once
4. idempotency
What is one important takeaway from studying Python async/await — Waiting Without Blocking?
1. Call it concurrently on a list of 10 prompts using gather
2. Write an async function that calls the Anthropic API with a prompt
3. Add a semaphore limiting to 3 concurrent calls
4. Measure total time with time.perf_counter — should be ~3x faster than sequential
Which of these does NOT belong in a discussion of Python async/await — Waiting Without Blocking?
1. Write an async function that calls the Anthropic API with a prompt
2. Add a semaphore limiting to 3 concurrent calls
3. Call it concurrently on a list of 10 prompts using gather
4. idempotency
What is the key insight about "The sync/async trap" in the context of Python async/await — Waiting Without Blocking?
1. idempotency
2. Have your AI assistant draft both — then check if it handled ties and missing ke…
3. agent loop
4. You cannot call an async function from sync code without asyncio.run.
What is the recommended tip about "Always review AI output" in the context of Python async/await — Waiting Without Blocking?
1. AI-generated code can hallucinate APIs, miss edge cases, or introduce subtle bugs.
2. idempotency
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
Which statement accurately describes an aspect of Python async/await — Waiting Without Blocking?
1. idempotency
2. Every call to Claude or GPT takes 1–10 seconds. If your app makes 50 calls sequentially, it takes minutes.
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
What does working with Python async/await — Waiting Without Blocking typically involve?
1. idempotency
2. Have your AI assistant draft both — then check if it handled ties and missing ke…
3. Big idea: most AI code spends 99% of its time waiting. Async is how you stop waiting in series and start waiting in parallel.
4. agent loop
Which best describes the scope of "Python async/await — Waiting Without Blocking"?
1. It is unrelated to ai-coding workflows
2. It applies only to the opposite beginner tier
3. It was deprecated in 2024 and no longer relevant
4. It focuses on Async lets your program make 100 API calls at once instead of one at a time. Essential for LLM apps.
Which section heading best belongs in a lesson about Python async/await — Waiting Without Blocking?
1. The two words: async and await
2. idempotency
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
Which section heading best belongs in a lesson about Python async/await — Waiting Without Blocking?
1. idempotency
2. Concurrency with rate limiting (the real world)
3. Have your AI assistant draft both — then check if it handled ties and missing ke…
4. agent loop
Which section heading best belongs in a lesson about Python async/await — Waiting Without Blocking?
1. idempotency
2. Have your AI assistant draft both — then check if it handled ties and missing ke…
3. Error handling across many coroutines
4. agent loop

← Back to interactive lesson