Lesson 52 of 1455
Video AI — Sora, Veo, Runway, Kling
Text-to-video became practical in 2025 and cinematic in 2026. Here's the state of the art and how to choose.
Builders · Creative AI · ~19 min read
The video moment
Video generation went from 'jittery 4-second clips' in early 2024 to 'broadcast-grade 4K with synchronized dialogue and music' in early 2026. Four models lead. They're genuinely useful for pre-visualization, ads, b-roll, and short films — though full Hollywood-grade filmmaking is still human + AI, not AI alone.
Compare the options
| Model | Best for | Max length / resolution | Audio? |
|---|---|---|---|
| OpenAI Sora 2 | Cinematic physics, multi-subject scenes. | ~20s, 1080p (upscalable). | Synced audio + dialogue. |
| Google Veo 3.1 | Photorealism, audio quality, character dialogue. | ~60s, 1080p. | Best-in-class synced audio. |
| Runway Gen-4.5 | Character consistency across scenes; pro editing. | ~10s per shot; stitch in Runway. | Synced audio. |
| Kuaishou Kling 3.0 | Native 4K / 60fps, longest clips (5 min), human motion. | 5 min, 4K. | Synced audio. |
| Luma Dream Machine / Pika 2 | Fast iteration, social-media clips, affordable. | ~10s, 1080p. | Only in some newer versions. |
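As a rough way to turn the table into a first-pass shortlist, the sketch below encodes only the length and resolution columns and filters on them. The class and function names are illustrative, the numbers are the approximate figures from the table, and Runway's resolution is left as unknown because the table does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VideoModel:
    name: str
    max_seconds: int               # approximate single-clip length from the table
    native_height: Optional[int]   # vertical pixels; None where the table gives no figure


# Figures copied from the comparison table; treat them as approximate.
MODELS = [
    VideoModel("OpenAI Sora 2", 20, 1080),
    VideoModel("Google Veo 3.1", 60, 1080),
    VideoModel("Runway Gen-4.5", 10, None),
    VideoModel("Kuaishou Kling 3.0", 300, 2160),
    VideoModel("Luma Dream Machine / Pika 2", 10, 1080),
]


def shortlist(min_seconds: int, min_height: int = 0) -> list[str]:
    """Return model names whose listed clip length and resolution meet the minimums."""
    return [
        m.name
        for m in MODELS
        # An unknown resolution counts as 0, so it only passes when no minimum is set.
        if m.max_seconds >= min_seconds and (m.native_height or 0) >= min_height
    ]


long_clips = shortlist(min_seconds=30)
native_4k = shortlist(min_seconds=5, min_height=2160)
```

This only narrows the field on hard limits. The "Best for" and audio columns still need a human judgment call.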
Prompting video
Video prompts have two extra slots beyond image prompts: motion and camera.
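One way to keep every slot explicit is to assemble the prompt from named fields, so a missing camera move or duration is obvious before you spend credits. This is a minimal sketch, not any provider's API: the `VideoPrompt` class and its field names are assumptions made for illustration, and the output is just a plain text prompt to paste into whichever tool you use.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    subject: str
    setting: str
    action: str            # the motion slot
    camera: str            # the camera slot
    style: str
    lighting: str
    duration_seconds: int

    def render(self) -> str:
        """Join the slots into a single plain-text prompt."""
        return (
            f"{self.subject} in {self.setting} {self.action}. "
            f"{self.camera}. "
            f"{self.style}, {self.lighting}. "
            f"{self.duration_seconds} seconds."
        )


prompt = VideoPrompt(
    subject="A chef",
    setting="a crowded Tokyo ramen shop",
    action="gently ladles broth into a bowl",
    camera="Camera slowly dollies in on her hands, then tilts up to her focused face",
    style="Shot on 35mm, shallow depth of field",
    lighting="warm practical lighting from paper lanterns",
    duration_seconds=8,
)
text = prompt.render()
```

Because every field is required, leaving out the motion or camera slot raises an error instead of silently producing an image-style prompt.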
A video prompt that specifies subject, setting, action, camera move, style, lighting, and duration.
A chef in a crowded Tokyo ramen shop gently ladles broth into a bowl. Steam rises. Camera slowly dollies in on her hands, then tilts up to her focused face. Shot on 35mm, shallow depth of field, warm practical lighting from paper lanterns. 8 seconds.
Image-to-video and ref-image workflows
- Image-to-video: generate a still in Midjourney/Flux you love, then animate it in Runway or Kling.
- Reference character: upload 3-5 images of a character to keep them consistent across shots (Runway Gen-4, Kling).
- Keyframe: specify first and last frame; the model fills the motion between (Luma, Runway).
What breaks
- Complex hand interactions (holding, typing) still glitch.
- Long narratives: characters drift after 10+ seconds unless you supply explicit reference images.
- Physics in unusual scenarios (zero-g, underwater with many objects).
- Fine text on signs/screens — still garbled most of the time.
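A common workaround for character drift is to plan a longer sequence as several short shots that all reuse the same reference images, then stitch them in an editor. The sketch below shows only the planning arithmetic. The 10-second default comes from the drift threshold mentioned in the list, and the `Shot` class, `plan_shots` function, and file names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    index: int
    seconds: int
    reference_images: tuple[str, ...]   # same references on every shot


def plan_shots(total_seconds: int, reference_images: list[str],
               max_shot_seconds: int = 10) -> list[Shot]:
    """Split a target duration into shots no longer than max_shot_seconds."""
    if total_seconds <= 0 or max_shot_seconds <= 0:
        raise ValueError("durations must be positive")
    refs = tuple(reference_images)
    shots: list[Shot] = []
    remaining = total_seconds
    while remaining > 0:
        seconds = min(remaining, max_shot_seconds)
        shots.append(Shot(len(shots) + 1, seconds, refs))
        remaining -= seconds
    return shots


# Hypothetical file names for a 25-second sequence.
plan = plan_shots(25, ["chef_front.png", "chef_side.png", "chef_hands.png"])
durations = [shot.seconds for shot in plan]
```

Each planned shot would then get its own prompt, with the shared reference images attached in whichever tool supports them.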
Ethical considerations
Video deepfakes of real people are a serious concern. All major providers (OpenAI, Google, Runway, Kuaishou) refuse to generate named public figures without their explicit opt-in, and they watermark outputs (C2PA, plus SynthID for Google). If you're shipping a product, disclose AI origin and comply with the TAKE IT DOWN Act (US) and the EU AI Act's labeling requirements.
