Video Generation at the API Level

Behind the glossy UIs, video models expose REST APIs. Here's how to call Sora, Veo, and Runway programmatically and build production pipelines.

40 min · Reviewed 2026

Video generation is inherently async

A single video generation call takes 30 seconds to 5 minutes. Every reputable API is asynchronous: POST a job, get a task ID, poll (or listen on a webhook) for completion, then download the result. Build this pattern into your product from day one; retrofitting asynchronous job handling onto a request/response design later is expensive.
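
The submit, poll, download loop is the same for every provider, so it is worth writing once. Here is a minimal sketch of a polling helper with a deadline and capped exponential backoff. The name `poll_until_done`, the `"state"` key, and the state values are assumptions for illustration; each real provider uses its own field names and status strings.

```python
import time
from typing import Callable

def poll_until_done(
    fetch_status: Callable[[], dict],
    *,
    timeout_s: float = 600.0,
    initial_delay_s: float = 2.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until it reports a terminal state or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        status = fetch_status()
        if status["state"] == "completed":      # hypothetical state name
            return status
        if status["state"] == "failed":         # hypothetical state name
            raise RuntimeError(status.get("error", "generation failed"))
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job not finished after {timeout_s} seconds")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)     # exponential backoff, capped

# Usage with a stand-in for a real status call (no network involved)
responses = iter([
    {"state": "queued"},
    {"state": "running"},
    {"state": "completed", "url": "https://example.com/v.mp4"},
])
result = poll_until_done(lambda: next(responses), sleep=lambda s: None)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. The deadline matters in production: a job that never leaves a non-terminal state should not hold a worker forever.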

Sora 2 API (OpenAI)

OpenAI released Sora 2 via API in late 2025. Note: in April 2026 OpenAI announced that it will discontinue the Sora web and app experiences, with API discontinuation to follow in September 2026, so Sora is a short-term option, not a long-term bet. Structure your code behind a provider interface.

from openai import OpenAI
import time

client = OpenAI()  # reads OPENAI_API_KEY from the environment

job = client.videos.create(
    model="sora-2",
    prompt="A chef ladles broth into a ramen bowl, steam rising, 35mm film look. Camera dollies in slowly.",
    seconds="8",           # clip length in seconds, passed as a string
    size="1280x720",       # width x height; landscape 16:9
)

# Poll for completion, giving up after 10 minutes
deadline = time.monotonic() + 600
while True:
    status = client.videos.retrieve(job.id)
    if status.status == "completed":
        break
    elif status.status == "failed":
        raise RuntimeError(status.error)
    if time.monotonic() > deadline:
        raise TimeoutError(f"video job {job.id} did not finish in time")
    time.sleep(5)

# Download and store
content = client.videos.download_content(job.id, variant="video")
content.write_to_file("./ramen.mp4")

Sora 2 via OpenAI Python SDK. Simple polling loop.

Veo 3.1 API (Google Vertex AI)

from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

aiplatform.init(project="my-project", location="us-central1")

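# The regional API endpoint must match the location above
client = aiplatform.gapic.PredictionServiceClient(
    client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
)
# Publisher models live under .../publishers/google/models/..., not under
# .../endpoints/..., so build the resource name directly
endpoint = (
    "projects/my-project/locations/us-central1/"
    "publishers/google/models/veo-3.1-generate-001"
)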

instance = json_format.ParseDict({
    "prompt": "Cinematic drone shot over rice terraces in Bali at sunrise",
    "duration_seconds": 8,
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "generate_audio": True,
}, Value())

# Long-running operation
operation = client.predict_long_running(
    endpoint=endpoint,
    instances=[instance],
)
result = operation.result(timeout=600)
video_bytes = result.predictions[0]["video_bytes"]

Veo 3.1 via Vertex AI. Uses the long-running operations pattern.

Runway Gen-4.5 API

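import requests
import time

RUNWAY_KEY = "rwa_..."
headers = {
    "Authorization": f"Bearer {RUNWAY_KEY}",
    "Content-Type": "application/json",
    "X-Runway-Version": "2024-11-06",  # API version header; use the date from Runway's docs
}

# Image-to-video (most consistent quality path)
resp = requests.post(
    "https://api.dev.runwayml.com/v1/image_to_video",
    headers=headers,
    json={
        "promptImage": "https://cdn.example.com/hero-frame.png",
        "promptText": "Camera slowly pulls back to reveal the full landscape",
        "model": "gen4_turbo",
        "duration": 10,
        "ratio": "1280:720",
    },
    timeout=30,
)
resp.raise_for_status()  # surface auth and validation errors early
task_id = resp.json()["id"]

# Poll until the task reaches a terminal state
while True:
    resp = requests.get(
        f"https://api.dev.runwayml.com/v1/tasks/{task_id}", headers=headers, timeout=30
    )
    resp.raise_for_status()
    status = resp.json()
    if status["status"] == "SUCCEEDED":
        video_url = status["output"][0]
        break
    if status["status"] in ("FAILED", "CANCELLED"):
        # Without this check a failed task would be polled forever
        raise RuntimeError(f"Runway task {task_id} ended as {status['status']}")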
    time.sleep(5)

Runway Gen-4 image-to-video, the most reliable quality path.

Production pipeline pattern

1. Accept a job from a user. Store in a queue (Redis, SQS).
2. Worker picks up the job, calls the video API, stores the task_id.
3. Webhook handler or polling worker updates job status.
4. On success, download video to object storage (S3/R2).
5. Notify user (email, websocket, push).
6. Periodic cleanup: providers keep result URLs for hours or days, not forever.
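
The steps above can be sketched end to end with in-memory stand-ins. This is a minimal sketch, not a production design: `queue.Queue` stands in for Redis or SQS, a plain dict stands in for object storage, and `StubProvider`, `JobRecord`, `run_worker`, and `poll_once` are hypothetical names invented for the example.

```python
import queue
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class JobRecord:
    job_id: str
    prompt: str
    state: str = "queued"               # queued -> submitted -> done
    task_id: Optional[str] = None       # provider-side ID, stored after submission
    storage_key: Optional[str] = None   # where the finished video was copied

class StubProvider:
    """Stand-in for a real video API: every task completes on its second status check."""
    def __init__(self) -> None:
        self._checks = {}

    def submit(self, prompt: str) -> str:
        task_id = f"task-{len(self._checks) + 1}"
        self._checks[task_id] = 0
        return task_id

    def status(self, task_id: str) -> str:
        self._checks[task_id] += 1
        return "completed" if self._checks[task_id] >= 2 else "running"

    def download(self, task_id: str) -> bytes:
        return f"video-bytes-for-{task_id}".encode()

def run_worker(jobs: queue.Queue, provider: StubProvider) -> list:
    """Steps 1-2: drain the queue, submit each job, and record its task_id."""
    submitted = []
    while not jobs.empty():
        record = jobs.get()
        record.task_id = provider.submit(record.prompt)
        record.state = "submitted"
        submitted.append(record)
    return submitted

def poll_once(records: list, provider: StubProvider, storage: dict,
              notify: Callable[[JobRecord], None]) -> None:
    """Steps 3-5: update status; on success copy the video to storage and notify."""
    for record in records:
        if record.state != "submitted":
            continue
        if provider.status(record.task_id) == "completed":
            record.storage_key = f"videos/{record.job_id}.mp4"
            storage[record.storage_key] = provider.download(record.task_id)
            record.state = "done"
            notify(record)

provider = StubProvider()
jobs = queue.Queue()
jobs.put(JobRecord(job_id="j1", prompt="rice terraces at sunrise"))
storage, notified = {}, []
records = run_worker(jobs, provider)
poll_once(records, provider, storage, notified.append)  # task still running
poll_once(records, provider, storage, notified.append)  # task completed, video stored
```

Copying the bytes into your own storage as soon as the task completes is what makes step 6 safe: once the provider's result URL expires, your stored copy is the only one left.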

Provider ergonomics compared

Provider	Submit pattern	Polling / webhook	Output format
Sora 2 (OpenAI)	client.videos.create(): sync-style SDK call.	client.videos.retrieve(id) polling.	MP4 via client.videos.download_content().
Veo 3.1 (Vertex AI)	client.predict_long_running() — LRO.	operation.result() with timeout.	Video bytes or GCS URI.
Runway Gen-4.5	POST /image_to_video or /text_to_video.	GET /tasks/{id} polling.	Hosted URL (hours TTL).
Kling 3.0	POST with signed auth; token-based.	Polling; webhook on enterprise.	Hosted URL + C2PA metadata.

Provider abstraction layer

from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class VideoJob:
    prompt: str
    duration: int
    resolution: str
    ref_image_url: str | None = None

class VideoProvider(ABC):
    @abstractmethod
    async def submit(self, job: VideoJob) -> str: ...  # returns task_id
    @abstractmethod
    async def status(self, task_id: str) -> dict: ...
    @abstractmethod
    async def download(self, task_id: str) -> bytes: ...

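# Concrete providers: each must implement submit(), status(), and download()
# before it can be instantiated (bodies omitted here).
class SoraProvider(VideoProvider): ...
class VeoProvider(VideoProvider): ...
class RunwayProvider(VideoProvider): ...
class KlingProvider(VideoProvider): ...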

def pick_provider(job: VideoJob, policy: str) -> VideoProvider:
    if policy == "cheapest_4k":
        return KlingProvider()
    if policy == "best_physics":
        return VeoProvider()
    if policy == "best_character_consistency":
        return RunwayProvider()
    return VeoProvider()  # sensible default

Abstract behind a provider interface. Video models consolidate fast.
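
One payoff of the interface is failover when a provider is down or retired. A minimal sketch follows, assuming a simplified interface whose `submit` takes a bare prompt string instead of a `VideoJob`. `ProviderUnavailable`, `RetiredProvider`, `WorkingProvider`, and `submit_with_fallback` are hypothetical names, and both providers are stubs that make no network calls.

```python
import asyncio
from abc import ABC, abstractmethod

class ProviderUnavailable(Exception):
    """Raised by a provider that cannot accept jobs (outage, retired API)."""

class VideoProvider(ABC):
    name: str

    @abstractmethod
    async def submit(self, prompt: str) -> str: ...  # returns task_id

class RetiredProvider(VideoProvider):
    name = "retired"

    async def submit(self, prompt: str) -> str:
        raise ProviderUnavailable("API has been discontinued")

class WorkingProvider(VideoProvider):
    name = "working"

    async def submit(self, prompt: str) -> str:
        return "task-123"  # a real provider would return the API's task ID

async def submit_with_fallback(prompt: str, providers: list) -> tuple:
    """Try providers in order; return (provider_name, task_id) from the first that accepts."""
    errors = []
    for provider in providers:
        try:
            return provider.name, await provider.submit(prompt)
        except ProviderUnavailable as exc:
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

chosen, task_id = asyncio.run(
    submit_with_fallback("a ramen bowl", [RetiredProvider(), WorkingProvider()])
)
```

Store the provider name alongside the task ID: status checks and downloads have to go back to whichever provider accepted the job.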

Quality-control pipeline

1. Generate 3-5 candidates per shot in parallel.
2. Run each through a lightweight classifier (CLIP similarity to prompt; motion analysis; face-detection stability).
3. Pick top candidate automatically; queue lower candidates for human review.
4. On human rejection, feed feedback back as a refined prompt (closed-loop).
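
The ranking step can be sketched as a weighted score over the classifier outputs. This is a minimal sketch with made-up numbers: the scores are assumed to be already computed and scaled to 0..1, and the weights are illustrative, not tuned values. `Candidate`, `WEIGHTS`, `combined_score`, and `triage` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    video_id: str
    prompt_similarity: float  # e.g. CLIP similarity, assumed scaled to 0..1
    motion_score: float       # 0..1, higher means smoother motion
    face_stability: float     # 0..1, higher means more stable faces

# Illustrative weights only; tune them against your own human-review outcomes
WEIGHTS = {"prompt_similarity": 0.5, "motion_score": 0.3, "face_stability": 0.2}

def combined_score(candidate: Candidate) -> float:
    """Weighted sum of the per-candidate classifier scores."""
    return sum(getattr(candidate, name) * weight for name, weight in WEIGHTS.items())

def triage(candidates: list) -> tuple:
    """Return (top candidate, remaining candidates ordered for human review)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    ranked = sorted(candidates, key=combined_score, reverse=True)
    return ranked[0], ranked[1:]

candidates = [
    Candidate("take-1", 0.70, 0.90, 0.80),
    Candidate("take-2", 0.90, 0.60, 0.90),
    Candidate("take-3", 0.50, 0.50, 0.50),
]
winner, review_queue = triage(candidates)
# winner is take-2; review_queue holds take-1, then take-3
```

Here take-2 wins on prompt match even though take-1 has smoother motion, which is the trade-off the weights encode. Human rejections from the review queue are the natural signal for adjusting them.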

End-of-lesson check

15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-creative-video-api-creators

What is the core idea behind "Video Generation at the API Level"?
1. Behind the glossy UIs, video models expose REST APIs. Here's how to call Sora, Veo, and Runway programmatically and build production pipelines.
2. Make every game successful
3. Replace on-set adaptation by the photographer
4. Apply AI tools in your creative workflow to get better results

Which term best describes a foundational idea in "Video Generation at the API Level"?
1. asynchronous generation
2. video API
3. provider abstraction
4. polling

A learner studying Video Generation at the API Level would need to understand which concept?
1. video API
2. provider abstraction
3. asynchronous generation
4. polling

Which of these is directly relevant to Video Generation at the API Level?
1. video API
2. asynchronous generation
3. polling
4. provider abstraction

Which of the following is a key point about Video Generation at the API Level?
1. Accept a job from a user. Store in a queue (Redis, SQS).
2. Worker picks up the job, calls the video API, stores the task_id.
3. Webhook handler or polling worker updates job status.
4. On success, download video to object storage (S3/R2).

Which of these does NOT belong in a discussion of Video Generation at the API Level?
1. Accept a job from a user. Store in a queue (Redis, SQS).
2. Worker picks up the job, calls the video API, stores the task_id.
3. Webhook handler or polling worker updates job status.
4. Make every game successful

Which statement is accurate regarding Video Generation at the API Level?
1. Run each through a lightweight classifier (CLIP similarity to prompt; motion analysis; face-detectio…
2. Pick top candidate automatically; queue lower candidates for human review.
3. Generate 3-5 candidates per shot in parallel.
4. On human rejection, feed feedback back as a refined prompt (closed-loop).

Which of these does NOT belong in a discussion of Video Generation at the API Level?
1. Make every game successful
2. Generate 3-5 candidates per shot in parallel.
3. Run each through a lightweight classifier (CLIP similarity to prompt; motion analysis; face-detectio…
4. Pick top candidate automatically; queue lower candidates for human review.

What is the key insight about "Budget and rate limits" in the context of Video Generation at the API Level?
1. Veo 3.1 enterprise tiers start in the low thousands per month. Runway API requires contract for production volume.
2. Make every game successful
3. Replace on-set adaptation by the photographer
4. Apply AI tools in your creative workflow to get better results

What is the key insight about "C2PA provenance on output" in the context of Video Generation at the API Level?
1. Make every game successful
2. Major providers now attach C2PA Content Credentials to outputs (Google Veo, Adobe, OpenAI to varying degrees).
3. Replace on-set adaptation by the photographer
4. Apply AI tools in your creative workflow to get better results

What is the recommended tip about "Use AI as a co-creator" in the context of Video Generation at the API Level?
1. Make every game successful
2. Replace on-set adaptation by the photographer
3. Set creative constraints before generating: tone, length, style reference, POV.
4. Apply AI tools in your creative workflow to get better results

Which statement accurately describes an aspect of Video Generation at the API Level?
1. Make every game successful
2. Replace on-set adaptation by the photographer
3. Apply AI tools in your creative workflow to get better results
4. A single video generation call takes 30 seconds to 5 minutes. Every reputable API is asynchronous: POST a job, get a task ID, poll (or liste…

What does working with Video Generation at the API Level typically involve?
1. OpenAI released Sora 2 via API in late 2025. Note: OpenAI announced April 2026 it will discontinue the Sora web/app experiences, with API di…
2. Make every game successful
3. Replace on-set adaptation by the photographer
4. Apply AI tools in your creative workflow to get better results

Which best describes the scope of "Video Generation at the API Level"?
1. It is unrelated to creative workflows
2. It focuses on Behind the glossy UIs, video models expose REST APIs. Here's how to call Sora, Veo, and Runway progr
3. It applies only to the opposite beginner tier
4. It was deprecated in 2024 and no longer relevant

Which section heading best belongs in a lesson about Video Generation at the API Level?
1. Make every game successful
2. Replace on-set adaptation by the photographer
3. Sora 2 API (OpenAI)
4. Apply AI tools in your creative workflow to get better results

