Behind the glossy UIs, video models expose REST APIs. Here's how to call Sora, Veo, and Runway programmatically and build production pipelines.
A single video generation call takes 30 seconds to 5 minutes. Every reputable API is asynchronous: POST a job, get a task ID, poll (or listen on a webhook) for completion, download the result. Build this pattern into your product from day one — latency retrofits are expensive.
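The submit-then-poll pattern is the same regardless of provider. Here is a minimal, provider-agnostic sketch of the polling half, with exponential backoff so you don't hammer the status endpoint; `get_status` is a hypothetical stand-in for whatever status call your vendor exposes:

```python
import time

def poll_until_done(get_status, timeout_s=600, base_delay=2.0, max_delay=30.0):
    """Poll a status callable until it reports completion or failure.

    get_status() must return a dict like {"status": "running" | "completed"
    | "failed", ...}. The delay doubles on each attempt, capped at max_delay.
    """
    deadline = time.monotonic() + timeout_s
    delay = base_delay
    while time.monotonic() < deadline:
        status = get_status()
        if status["status"] == "completed":
            return status
        if status["status"] == "failed":
            raise RuntimeError(status.get("error", "generation failed"))
        time.sleep(delay)
        delay = min(delay * 2, max_delay)
    raise TimeoutError(f"job not finished after {timeout_s}s")
```

The same helper works for every provider below; only the `get_status` closure changes.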
OpenAI released Sora 2 via API in late 2025. Note: in April 2026 OpenAI announced it will discontinue the Sora web/app experiences, with API discontinuation following in September 2026 — so treat Sora as a short-term option, not a long-term bet, and structure your code behind a provider interface.
from openai import OpenAI
import time

client = OpenAI()

job = client.videos.generate(
    model="sora-2",
    prompt="A chef ladles broth into a ramen bowl, steam rising, 35mm film look. Camera dollies in slowly.",
    duration=10,  # seconds
    resolution="1080p",
    aspect_ratio="16:9",
)

# Poll for completion
while True:
    status = client.videos.retrieve(job.id)
    if status.status == "completed":
        video_url = status.video_url
        break
    elif status.status == "failed":
        raise RuntimeError(status.error)
    time.sleep(5)

# Download and store
import urllib.request
urllib.request.urlretrieve(video_url, "./ramen.mp4")

Sora 2 via the OpenAI Python SDK. Simple polling loop.

from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

aiplatform.init(project="my-project", location="us-central1")
client = aiplatform.gapic.PredictionServiceClient()

endpoint = client.endpoint_path(
    project="my-project",
    location="us-central1",
    endpoint="publishers/google/models/veo-3.1-generate-001",
)

instance = json_format.ParseDict({
    "prompt": "Cinematic drone shot over rice terraces in Bali at sunrise",
    "duration_seconds": 8,
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "generate_audio": True,
}, Value())

# Long-running operation
operation = client.predict_long_running(
    endpoint=endpoint,
    instances=[instance],
)
result = operation.result(timeout=600)
video_bytes = result.predictions[0]["video_bytes"]

Veo 3.1 via Vertex AI. Uses the long-running operations pattern.

import requests
import time

RUNWAY_KEY = "rwa_..."
headers = {"Authorization": f"Bearer {RUNWAY_KEY}", "Content-Type": "application/json"}

# Image-to-video (most consistent quality path)
resp = requests.post(
    "https://api.dev.runwayml.com/v1/image_to_video",
    headers=headers,
    json={
        "promptImage": "https://cdn.example.com/hero-frame.png",
        "promptText": "Camera slowly pulls back to reveal the full landscape",
        "model": "gen4_turbo",
        "duration": 10,
        "ratio": "1280:720",
    },
)
task_id = resp.json()["id"]

while True:
    status = requests.get(
        f"https://api.dev.runwayml.com/v1/tasks/{task_id}", headers=headers
    ).json()
    if status["status"] == "SUCCEEDED":
        video_url = status["output"][0]
        break
    if status["status"] == "FAILED":  # without this check the loop spins forever
        raise RuntimeError(status.get("failure", "generation failed"))
    time.sleep(5)

Runway Gen-4 image-to-video — the most reliable quality path.

| Provider | Submit pattern | Polling / webhook | Output format |
|---|---|---|---|
| Sora 2 (OpenAI) | client.videos.generate() — sync-style SDK. | client.videos.retrieve(id) polling. | Signed URL, MP4. |
| Veo 3.1 (Vertex AI) | client.predict_long_running() — LRO. | operation.result() with timeout. | Video bytes or GCS URI. |
| Runway Gen-4.5 | POST /image_to_video or /text_to_video. | GET /tasks/{id} polling. | Hosted URL (hours TTL). |
| Kling 3.0 | POST with signed auth; token-based. | Polling; webhook on enterprise. | Hosted URL + C2PA metadata. |
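Since several providers (Runway, Kling) return hosted URLs with a TTL measured in hours, persist the output to your own storage the moment a job completes. A minimal stdlib-only sketch — `persist_video` is an illustrative helper, not part of any vendor SDK:

```python
import urllib.request

def persist_video(url: str, dest_path: str, chunk_size: int = 1 << 20) -> int:
    """Stream a generated video from a provider's signed URL to local disk.

    Returns the number of bytes written. Reading in chunks avoids holding a
    multi-hundred-megabyte file in memory at once.
    """
    written = 0
    with urllib.request.urlopen(url, timeout=60) as resp, open(dest_path, "wb") as f:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            f.write(chunk)
            written += len(chunk)
    return written
```

In production you would write to object storage (S3, GCS) instead of local disk, but the streaming shape is the same.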
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class VideoJob:
    prompt: str
    duration: int
    resolution: str
    ref_image_url: str | None = None

class VideoProvider(ABC):
    @abstractmethod
    async def submit(self, job: VideoJob) -> str: ...  # returns task_id

    @abstractmethod
    async def status(self, task_id: str) -> dict: ...

    @abstractmethod
    async def download(self, task_id: str) -> bytes: ...

class SoraProvider(VideoProvider): ...
class VeoProvider(VideoProvider): ...
class RunwayProvider(VideoProvider): ...
class KlingProvider(VideoProvider): ...

def pick_provider(job: VideoJob, policy: str) -> VideoProvider:
    if policy == "cheapest_4k":
        return KlingProvider()
    if policy == "best_physics":
        return VeoProvider()
    if policy == "best_character_consistency":
        return RunwayProvider()
    return VeoProvider()  # sensible default

Abstract behind a provider interface. Video models consolidate fast.