The Responses API is OpenAI's modern surface. One call, text and tools. Learn the shape you'll use most.
OpenAI ships chat.completions (classic) and responses (modern). New code should prefer responses — it unifies text, tools, and structured output.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    try:
        r = client.responses.create(
            model="gpt-5",
            input=[
                {"role": "system", "content": "Be concise."},
                {"role": "user", "content": prompt},
            ],
        )
        return r.output_text
    except Exception as e:
        print(f"OpenAI call failed: {e}")
        raise

print(ask("Explain recursion in one sentence."))

`output_text` is a convenience accessor that concatenates all text in the response.

def ask_stream(prompt: str) -> None:
    with client.responses.stream(
        model="gpt-5",
        input=[{"role": "user", "content": prompt}],
    ) as stream:
        for event in stream:
            if event.type == "response.output_text.delta":
                print(event.delta, end="", flush=True)
        stream.until_done()
    print()

The context manager ensures the stream closes. Event types are strings; filter for the text delta.
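One failure mode the snippets above don't handle is HTTP 429 (rate limit). The right response is to wait and retry with exponentially growing delays rather than hammering the endpoint. A minimal sketch, using a stand-in exception and a fake call so it runs offline; in real code you would catch the SDK's rate-limit error instead:

```python
import random
import time

class RateLimited(Exception):
    """Stand-in for the SDK's 429 error (openai.RateLimitError in real code)."""

def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn, retrying rate-limited attempts with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # retries exhausted: surface the error
            # Delays grow 1s, 2s, 4s, ...; jitter keeps many clients
            # from all retrying in lockstep.
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.25))

# Demo: fail twice with a simulated 429, then succeed on the third attempt.
attempts = 0
def flaky():
    global attempts
    attempts += 1
    if attempts < 3:
        raise RateLimited()
    return "ok"

result = call_with_backoff(flaky, base_delay=0.01)
print(result)  # "ok", after two short backoff waits
```

Without backoff, each immediate retry counts against the same rate limit, so the 429s keep coming; the doubling delay gives the quota window time to reset.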
The big idea: use responses.create for the modern path, stream for UIs, and centralize model IDs so provider swaps are painless.
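The "centralize model IDs" point can be sketched concretely. `MODEL` and the stub client below are illustrative names, not part of the SDK; the stub only exists so the example runs offline:

```python
MODEL = "gpt-5"  # single source of truth; change it once, every call site follows

def ask(client, prompt: str) -> str:
    # Call sites never spell out the model string themselves.
    r = client.responses.create(
        model=MODEL,
        input=[{"role": "user", "content": prompt}],
    )
    return r.output_text

# Offline demo: a stub client that records which model it was asked for.
class _StubResponses:
    def create(self, model, input):
        self.last_model = model
        return type("R", (), {"output_text": f"[{model}] reply"})()

class StubClient:
    def __init__(self):
        self.responses = _StubResponses()

client = StubClient()
print(ask(client, "hi"))  # prints "[gpt-5] reply"
```

With the constant in one module, upgrading to a newer model (or swapping providers) is a one-line change instead of a hunt through 15 files.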
15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-progx-openai-api-creators
A team has hard-coded the model ID 'gpt-4o' in 15 different source files. What problem does this create when OpenAI releases a better model?
Your application receives a 429 (rate limit) response from the OpenAI API. What strategy should you implement to handle this correctly?
What functionality does the `.with_raw_response` method provide in the OpenAI Python SDK?
In the context of the OpenAI API, what is streaming primarily used for?
What function is used to create a request in the modern Responses API?
A developer stores the model ID in a single config constant instead of hard-coding it throughout the application. What is the primary benefit of this approach?
What types of output does the Responses API unify in a single call?
What is exponential backoff in the context of API error handling?
When should developers prefer the Responses API over chat.completions?
What does the `output_text` property contain in an OpenAI API response?
Why is streaming particularly beneficial for chat applications with user interfaces?
What happens if an application repeatedly ignores 429 responses without implementing backoff?
What is the advantage of using `.with_raw_response` combined with a retry library like tenacity?
What does a 429 HTTP status code indicate when received from an API?
For a real-time chat application where users expect instant feedback, which API feature is most important to implement?