Video AI — Sora, Veo, Runway, Kling
Text-to-video became practical in 2025 and cinematic in 2026. Here's the state of the art and how to choose.
The video moment
Video generation went from 'jittery 4-second clips' in early 2024 to 'broadcast-grade 4K with synchronized dialogue and music' in early 2026. Four models lead. They're genuinely useful for pre-visualization, ads, b-roll, and short films — though full Hollywood-grade filmmaking is still human + AI, not AI alone.
Compare the options
| Model | Best for | Max length / resolution | Audio? |
|---|---|---|---|
| OpenAI Sora 2 | Cinematic physics, multi-subject scenes. | ~20s, 1080p (upscalable). | Synced audio + dialogue. |
| Google Veo 3.1 | Photorealism, audio quality, character dialogue. | ~60s, 1080p. | Best-in-class synced audio. |
| Runway Gen-4.5 | Character consistency across scenes; pro editing. | ~10s per shot; stitch in Runway. | Synced audio. |
| Kuaishou Kling 3.0 | Native 4K / 60fps, longest clips (5 min), human motion. | 5 min, 4K. | Synced audio. |
| Luma Dream Machine / Pika 2 | Fast iteration, social-media clips, affordable. | ~10s, 1080p. | Newer; varies by model. |
Prompting video
Video prompts have two extra slots beyond image prompts: motion and camera.
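If it helps to see all seven slots together (the image-prompting five plus motion and camera), here's a tiny illustrative helper that assembles a prompt from them. The function and its slot names are just this lesson's framework, not any vendor's API; a worked example follows.

```python
# Illustrative only: build a video prompt from the seven slots this lesson
# uses (subject, setting, action, camera, style, lighting, duration).
def video_prompt(subject: str, setting: str, action: str,
                 camera: str, style: str, lighting: str,
                 duration_s: int) -> str:
    return (f"{subject} in {setting} {action}. "
            f"Camera: {camera}. {style}, {lighting}. "
            f"{duration_s} seconds.")

print(video_prompt(
    subject="A chef",
    setting="a crowded Tokyo ramen shop",
    action="gently ladles broth into a bowl; steam rises",
    camera="slow dolly in on her hands, then tilt up to her focused face",
    style="Shot on 35mm, shallow depth of field",
    lighting="warm practical lighting from paper lanterns",
    duration_s=8,
))
```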
A video prompt that specifies subject, setting, action, camera move, style, lighting, and duration:
A chef in a crowded Tokyo ramen shop gently ladles broth into a bowl. Steam rises. Camera slowly dollies in on her hands, then tilts up to her focused face. Shot on 35mm, shallow depth of field, warm practical lighting from paper lanterns. 8 seconds.
Image-to-video and ref-image workflows
- Image-to-video: generate a still in Midjourney/Flux you love, then animate it in Runway or Kling (see the sketch after this list).
- Reference character: upload 3-5 images of a character to keep them consistent across shots (Runway Gen-4, Kling).
- Keyframe: specify first and last frame; the model fills the motion between (Luma, Runway).
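To make the image-to-video workflow concrete, here's a minimal sketch of the usual pattern: submit a job, then poll until the clip is ready. The endpoint, field names, and job states below are hypothetical placeholders, not any specific vendor's API; Runway, Kling, and Luma each have their own, so adapt this to your provider's docs.

```python
# Hypothetical image-to-video job: placeholder endpoint and field names.
import os
import time
import requests

API_BASE = "https://api.example-video.com/v1"  # hypothetical endpoint
HEADERS = {"Authorization": f"Bearer {os.environ.get('VIDEO_API_KEY', '')}"}

# 1. Submit: a still you like, plus motion and camera instructions.
job = requests.post(
    f"{API_BASE}/image-to-video",
    headers=HEADERS,
    json={
        "image_url": "https://example.com/ramen-chef.png",  # your Midjourney/Flux still
        "prompt": "Steam rises; camera slowly dollies in on her hands.",
        "duration_seconds": 8,
    },
).json()

# 2. Poll: video generation is async and typically takes tens of seconds.
while True:
    status = requests.get(f"{API_BASE}/jobs/{job['id']}", headers=HEADERS).json()
    if status["state"] in ("succeeded", "failed"):
        break
    time.sleep(5)

print(status.get("video_url", "generation failed"))
```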
What breaks
- Complex hand interactions (holding, typing) still glitch.
- Long narratives — characters drift over 10+ seconds without explicit reference.
- Physics in unusual scenarios (zero-g, underwater with many objects).
- Fine text on signs/screens — still garbled most of the time.
Ethical considerations
Video deepfakes of real people are a serious concern. All major providers (OpenAI, Google, Runway, Kuaishou) refuse to generate named public figures without their explicit opt-in, and they watermark outputs (C2PA + SynthID for Google). If you're shipping a product, disclose AI origin and respect the TAKE IT DOWN Act (US) and EU AI Act labeling.
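If you're shipping generated video, it's worth checking that the provenance metadata actually survived your pipeline (re-encoding can strip it). Here's a hedged sketch using the open-source c2patool CLI from the Content Authenticity Initiative; it assumes the tool is installed, and exact output and flags vary by version.

```python
# Sketch: check a generated clip for a C2PA provenance manifest.
# Assumes the open-source `c2patool` CLI is installed and on PATH.
import json
import subprocess

def read_c2pa_manifest(path: str) -> dict | None:
    """Return the file's C2PA manifest as a dict, or None if absent."""
    result = subprocess.run(
        ["c2patool", path],  # prints the manifest report as JSON by default
        capture_output=True, text=True,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None  # no manifest found, or the tool reported an error
    return json.loads(result.stdout)

manifest = read_c2pa_manifest("clip.mp4")
print("C2PA manifest found" if manifest else "No provenance data")
```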