Video Generation at the API Level
Behind the glossy UIs, video models expose REST APIs. Here's how to call Sora, Veo, and Runway programmatically and build production pipelines.
What this lesson covers
1. Video generation is inherently async
2. Sora API
3. Veo API
4. Runway API
Video generation is inherently async
A single video generation call takes 30 seconds to 5 minutes. Every reputable API is asynchronous: POST a job, get a task ID, poll (or listen on a webhook) for completion, download the result. Build this pattern into your product from day one — latency retrofits are expensive.
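The submit, poll, download cycle can be sketched generically. In this sketch, `get_status` is a hypothetical stand-in for whichever provider status endpoint you call; the `state` keys and a hard timeout are assumptions, not any specific provider's API:

```python
import time

def poll_until_done(get_status, task_id, interval=5.0, timeout=600.0):
    """Poll a hypothetical get_status(task_id) -> dict until the job
    finishes, with a hard timeout so a stuck job cannot hang a worker."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(task_id)
        if status["state"] == "completed":
            return status  # caller downloads status["url"]
        if status["state"] == "failed":
            raise RuntimeError(status.get("error", "generation failed"))
        time.sleep(interval)
    raise TimeoutError(f"job {task_id} did not finish in {timeout}s")
```

The timeout matters in production: without it, a provider that silently drops a job ties up a worker forever.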
Sora 2 API (OpenAI)
OpenAI released Sora 2 via API in late 2025. Note, however, that in April 2026 OpenAI announced it will discontinue the Sora web and app experiences, with the API following in September 2026, so treat Sora as a short-term option rather than a long-term bet. Structure your code behind a provider interface so the model can be swapped out.
Sora 2 via the OpenAI Python SDK, with a simple polling loop:

```python
from openai import OpenAI
import time
import urllib.request

client = OpenAI()

# Submit the generation job (async: returns immediately with a job id)
job = client.videos.generate(
    model="sora-2",
    prompt=(
        "A chef ladles broth into a ramen bowl, steam rising, "
        "35mm film look. Camera dollies in slowly."
    ),
    duration=10,  # seconds
    resolution="1080p",
    aspect_ratio="16:9",
)

# Poll for completion
while True:
    status = client.videos.retrieve(job.id)
    if status.status == "completed":
        video_url = status.video_url
        break
    elif status.status == "failed":
        raise RuntimeError(status.error)
    time.sleep(5)

# Download and store the result
urllib.request.urlretrieve(video_url, "./ramen.mp4")
```

Veo 3.1 API (Google Vertex AI)
Veo 3.1 via Vertex AI, using the long-running operation (LRO) pattern:

```python
from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

aiplatform.init(project="my-project", location="us-central1")
client = aiplatform.gapic.PredictionServiceClient()

endpoint = client.endpoint_path(
    project="my-project",
    location="us-central1",
    endpoint="publishers/google/models/veo-3.1-generate-001",
)

instance = json_format.ParseDict({
    "prompt": "Cinematic drone shot over rice terraces in Bali at sunrise",
    "duration_seconds": 8,
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "generate_audio": True,
}, Value())

# Long-running operation: submit, then block on the result with a timeout
operation = client.predict_long_running(
    endpoint=endpoint,
    instances=[instance],
)
result = operation.result(timeout=600)
video_bytes = result.predictions[0]["video_bytes"]
```

Runway Gen-4.5 API
Runway image-to-video (here with the gen4_turbo model). Starting from a reference image is the most reliable quality path:

```python
import requests
import time

RUNWAY_KEY = "rwa_..."
headers = {
    "Authorization": f"Bearer {RUNWAY_KEY}",
    "Content-Type": "application/json",
}

# Image-to-video: animate an existing frame (most consistent quality path)
resp = requests.post(
    "https://api.dev.runwayml.com/v1/image_to_video",
    headers=headers,
    json={
        "promptImage": "https://cdn.example.com/hero-frame.png",
        "promptText": "Camera slowly pulls back to reveal the full landscape",
        "model": "gen4_turbo",
        "duration": 10,
        "ratio": "1280:720",
    },
)
resp.raise_for_status()
task_id = resp.json()["id"]

# Poll the task until it succeeds or fails
while True:
    status = requests.get(
        f"https://api.dev.runwayml.com/v1/tasks/{task_id}", headers=headers
    ).json()
    if status["status"] == "SUCCEEDED":
        video_url = status["output"][0]
        break
    if status["status"] == "FAILED":
        raise RuntimeError(status.get("failure", "Runway task failed"))
    time.sleep(5)
```

Production pipeline pattern
1. Accept a job from a user. Store it in a queue (Redis, SQS).
2. A worker picks up the job, calls the video API, and stores the task_id.
3. A webhook handler or polling worker updates the job status.
4. On success, download the video to object storage (S3/R2).
5. Notify the user (email, websocket, push).
6. Periodic cleanup: providers keep result URLs for hours or days, not forever.
Provider ergonomics compared
| Provider | Submit pattern | Polling / webhook | Output format |
|---|---|---|---|
| Sora 2 (OpenAI) | client.videos.generate() — sync-style SDK. | client.videos.retrieve(id) polling. | Signed URL, MP4. |
| Veo 3.1 (Vertex AI) | client.predict_long_running() — LRO. | operation.result() with timeout. | Video bytes or GCS URI. |
| Runway Gen-4.5 | POST /image_to_video or /text_to_video. | GET /tasks/{id} polling. | Hosted URL (hours TTL). |
| Kling 3.0 | POST with signed auth; token-based. | Polling; webhook on enterprise. | Hosted URL + C2PA metadata. |
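One practical consequence of the table above: each provider reports job status in its own vocabulary, so it pays to normalize statuses at the boundary. A minimal sketch; the raw status strings are illustrative, not an exhaustive or verified list per provider:

```python
# Map each provider's raw status strings onto a small common vocabulary
# so the rest of the pipeline stays provider-agnostic.
STATUS_MAP = {
    "sora":   {"completed": "done", "failed": "failed",
               "in_progress": "running", "queued": "running"},
    "veo":    {"SUCCEEDED": "done", "FAILED": "failed", "RUNNING": "running"},
    "runway": {"SUCCEEDED": "done", "FAILED": "failed",
               "RUNNING": "running", "PENDING": "running"},
}

def normalize_status(provider: str, raw: str) -> str:
    status = STATUS_MAP[provider].get(raw)
    if status is None:
        # Fail loudly on unknown states rather than silently spinning
        raise ValueError(f"unknown status {raw!r} from {provider}")
    return status
```

Failing loudly on unknown states is deliberate: a new status value from a provider should surface as an error, not an infinite polling loop.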
Provider abstraction layer
Abstract providers behind a common interface; video models consolidate fast, and swapping one out should not require rewriting your pipeline.
```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class VideoJob:
    prompt: str
    duration: int
    resolution: str
    ref_image_url: str | None = None


class VideoProvider(ABC):
    @abstractmethod
    async def submit(self, job: VideoJob) -> str: ...  # returns task_id

    @abstractmethod
    async def status(self, task_id: str) -> dict: ...

    @abstractmethod
    async def download(self, task_id: str) -> bytes: ...


class SoraProvider(VideoProvider): ...
class VeoProvider(VideoProvider): ...
class RunwayProvider(VideoProvider): ...
class KlingProvider(VideoProvider): ...


def pick_provider(job: VideoJob, policy: str) -> VideoProvider:
    if policy == "cheapest_4k":
        return KlingProvider()
    if policy == "best_physics":
        return VeoProvider()
    if policy == "best_character_consistency":
        return RunwayProvider()
    return VeoProvider()  # sensible default
```

Quality-control pipeline
1. Generate 3-5 candidates per shot in parallel.
2. Run each through a lightweight classifier (CLIP similarity to the prompt, motion analysis, face-detection stability).
3. Pick the top candidate automatically; queue the lower candidates for human review.
4. On human rejection, feed the feedback back as a refined prompt (closed loop).
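Steps 1 through 3 can be sketched as a triage function. Here `score_fn` is a stand-in for the real classifier (CLIP similarity, motion analysis, and so on), and the threshold is an illustrative assumption:

```python
def triage_candidates(candidates, score_fn, review_threshold=0.5):
    """Rank candidates by score; return (auto_pick, for_human_review)."""
    # Compute each score once, then sort best-first
    scored = sorted(
        ((score_fn(c), c) for c in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    best = scored[0][1]
    # Lower candidates above the threshold are worth a human look;
    # anything below it is discarded outright
    for_review = [c for s, c in scored[1:] if s >= review_threshold]
    return best, for_review
```

The same function works whether the score is a single CLIP similarity or a weighted blend of several signals, since the classifier is injected rather than hard-coded.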