Behind the glossy UIs, video models expose REST APIs. Here's how to call Sora, Veo, and Runway programmatically and build production pipelines.
A single video generation call typically takes 30 seconds to 5 minutes, so every reputable API is asynchronous: POST a job, get a task ID, poll (or listen on a webhook) for completion, then download the result. Build this pattern into your product from day one; retrofitting it onto code that assumes a synchronous call is expensive.
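The submit, poll, download loop can be factored into one small helper. The sketch below is provider-agnostic and hypothetical: `poll_until_done`, its parameters, and the default status strings are illustrative names, not part of any vendor SDK. It adds two things a bare `while True` loop lacks, a deadline and a gradually increasing wait between polls.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], str],
    *,
    done: str = "completed",      # assumed terminal "success" string
    failed: str = "failed",       # assumed terminal "failure" string
    interval: float = 5.0,        # first wait between polls, in seconds
    max_interval: float = 30.0,   # cap for the backoff
    timeout: float = 600.0,       # give up after this many seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status until it reports done or failed, or the deadline passes."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status == done:
            return status
        if status == failed:
            raise RuntimeError("video job failed")
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} s")
        sleep(interval)
        # Back off gradually so long jobs do not hammer the API.
        interval = min(interval * 1.5, max_interval)
```

A provider-specific caller would pass something like `lambda: client.videos.retrieve(job_id).status` as `fetch_status`. The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without waiting.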
OpenAI released Sora 2 via API in late 2025. Note: in April 2026 OpenAI announced that it will discontinue the Sora web and app experiences, with API discontinuation to follow in September 2026, so Sora is a short-term option, not a long-term bet. Structure your code behind a provider interface.
```python
import time
import urllib.request

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Method and parameter names follow this lesson; confirm them against
# the current OpenAI SDK reference before relying on them.
job = client.videos.generate(
    model="sora-2",
    prompt="A chef ladles broth into a ramen bowl, steam rising, 35mm film look. Camera dollies in slowly.",
    duration=10,  # seconds
    resolution="1080p",
    aspect_ratio="16:9",
)

# Poll for completion
while True:
    status = client.videos.retrieve(job.id)
    if status.status == "completed":
        video_url = status.video_url
        break
    elif status.status == "failed":
        raise RuntimeError(status.error)
    time.sleep(5)

# Download and store
urllib.request.urlretrieve(video_url, "./ramen.mp4")
```

Sora 2 via OpenAI Python SDK. Simple polling loop.

```python
from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

PROJECT = "my-project"
LOCATION = "us-central1"

aiplatform.init(project=PROJECT, location=LOCATION)

# Regional requests need the matching regional API endpoint.
client = aiplatform.gapic.PredictionServiceClient(
    client_options={"api_endpoint": f"{LOCATION}-aiplatform.googleapis.com"}
)

# Publisher models are addressed by resource path, not by a deployed-endpoint ID,
# so the path is built directly rather than with client.endpoint_path().
endpoint = (
    f"projects/{PROJECT}/locations/{LOCATION}"
    "/publishers/google/models/veo-3.1-generate-001"
)

instance = json_format.ParseDict(
    {
        "prompt": "Cinematic drone shot over rice terraces in Bali at sunrise",
        "duration_seconds": 8,
        "resolution": "1080p",
        "aspect_ratio": "16:9",
        "generate_audio": True,
    },
    Value(),
)

# Long-running operation: block until the job finishes or 10 minutes pass
operation = client.predict_long_running(
    endpoint=endpoint,
    instances=[instance],
)
result = operation.result(timeout=600)
video_bytes = result.predictions[0]["video_bytes"]
```

Veo 3.1 via Vertex AI. Uses the long-running operations pattern.

```python
import time

import requests

RUNWAY_KEY = "rwa_"
headers = {
    "Authorization": f"Bearer {RUNWAY_KEY}",
    "Content-Type": "application/json",
    # Runway requires an API version header; check the docs for the current date string.
    "X-Runway-Version": "2024-11-06",
}

# Image-to-video (most consistent quality path)
resp = requests.post(
    "https://api.dev.runwayml.com/v1/image_to_video",
    headers=headers,
    json={
        "promptImage": "https://cdn.example.com/hero-frame.png",
        "promptText": "Camera slowly pulls back to reveal the full landscape",
        "model": "gen4_turbo",
        "duration": 10,
        "ratio": "1280:720",
    },
)
resp.raise_for_status()  # surface auth or validation errors immediately
task_id = resp.json()["id"]

# Poll the task until it reaches a terminal state
while True:
    status = requests.get(
        f"https://api.dev.runwayml.com/v1/tasks/{task_id}", headers=headers
    ).json()
    if status["status"] == "SUCCEEDED":
        video_url = status["output"][0]
        break
    elif status["status"] == "FAILED":
        # Without this branch a failed task would poll forever.
        raise RuntimeError(status.get("failure", "Runway task failed"))
    time.sleep(5)
```

Runway Gen-4 image-to-video: the most reliable quality path.

| Provider | Submit pattern | Polling / webhook | Output format |
|---|---|---|---|
| Sora 2 (OpenAI) | client.videos.generate() — sync-style SDK. | client.videos.retrieve(id) polling. | Signed URL, MP4. |
| Veo 3.1 (Vertex AI) | client.predict_long_running() — LRO. | operation.result() with timeout. | Video bytes or GCS URI. |
| Runway Gen-4.5 | POST /image_to_video or /text_to_video. | GET /tasks/{id} polling. | Hosted URL (hours TTL). |
| Kling 3.0 | POST with signed auth; token-based. | Polling; webhook on enterprise. | Hosted URL + C2PA metadata. |
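Because each provider in the table reports completion differently, it can help to map raw status strings onto one small set of states before the rest of the pipeline sees them. This is a minimal sketch, not any vendor's API: `JobState` and `normalize_status` are hypothetical names, the `"completed"`, `"failed"`, and `"SUCCEEDED"` strings come from the Sora and Runway examples in this lesson, and treating `"FAILED"` as a Runway failure status is an assumption.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


# Terminal status strings, compared case-insensitively.
# "completed"/"failed" (Sora) and "SUCCEEDED" (Runway) appear in this lesson;
# an upper-case "FAILED" from Runway is an assumption.
_DONE = {"completed", "succeeded"}
_FAILED = {"failed"}


def normalize_status(raw: str) -> JobState:
    """Map a provider-specific status string onto a shared JobState."""
    key = raw.strip().lower()
    if key in _DONE:
        return JobState.DONE
    if key in _FAILED:
        return JobState.FAILED
    # Anything unrecognized is treated as still in progress.
    return JobState.PENDING
```

Treating unknown strings as pending keeps the loop tolerant of new intermediate states, but it also means an unrecognized failure status would never terminate on its own, so this mapping should always be paired with a polling timeout.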
```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class VideoJob:
    prompt: str
    duration: int  # seconds
    resolution: str
    ref_image_url: str | None = None


class VideoProvider(ABC):
    @abstractmethod
    async def submit(self, job: VideoJob) -> str:
        """Start a generation job and return its task ID."""

    @abstractmethod
    async def status(self, task_id: str) -> dict:
        """Return the provider's current status payload for the task."""

    @abstractmethod
    async def download(self, task_id: str) -> bytes:
        """Return the finished video as raw bytes."""


# Concrete providers are left as stubs here. Each one must implement
# submit/status/download before it can be instantiated.
class SoraProvider(VideoProvider): ...
class VeoProvider(VideoProvider): ...
class RunwayProvider(VideoProvider): ...
class KlingProvider(VideoProvider): ...


def pick_provider(job: VideoJob, policy: str) -> VideoProvider:
    if policy == "cheapest_4k":
        return KlingProvider()
    if policy == "best_physics":
        return VeoProvider()
    if policy == "best_character_consistency":
        return RunwayProvider()
    return VeoProvider()  # sensible default
```

Abstract behind a provider interface. Video models consolidate fast.

8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-creative-video-api-creators
What is the main idea of "Video Generation at the API Level"?
Which concept is most central to "Video Generation at the API Level"?
Which use of AI fits this topic best?
What should a careful learner remember about "Budget and rate limits"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about Sora API be treated?
Name one way to verify an AI answer about Sora API.
Which action would help you apply "Video Generation at the API Level" responsibly?