Tendril

Video AI — Sora, Veo, Runway, Kling

Text-to-video became practical in 2025 and cinematic in 2026. Here's the state of the art and how to choose.

Builders · Creative AI · ~19 min read

The video moment

Video generation went from 'jittery 4-second clips' in early 2024 to 'broadcast-grade 4K with synchronized dialogue and music' in early 2026. Four models lead: Sora, Veo, Runway, and Kling. They're genuinely useful for pre-visualization, ads, b-roll, and short films, though full Hollywood-grade filmmaking still takes humans working with AI, not AI alone.

Compare the options

Model	Best for	Max length / resolution	Audio?
OpenAI Sora 2	Cinematic physics, multi-subject scenes.	~20s, 1080p (upscalable).	Synced audio + dialogue.
Google Veo 3.1	Photorealism, audio quality, character dialogue.	~60s, 1080p.	Best-in-class synced audio.
Runway Gen-4.5	Character consistency across scenes; pro editing.	~10s per shot; stitch in Runway.	Synced audio.
Kuaishou Kling 3.0	Native 4K / 60fps, longest clips (5 min), human motion.	5 min, 4K.	Synced audio.
Luma Dream Machine / Pika 2	Fast iteration, social-media clips, affordable.	~10s, 1080p.	Only in some newer models.

Prompting video

Video prompts have two extra slots beyond image prompts: motion and camera.

Example of a video prompt that specifies subject, setting, action, camera move, style, lighting, and duration:

A chef in a crowded Tokyo ramen shop gently ladles broth into a bowl. Steam rises. Camera slowly dollies in on her hands, then tilts up to her focused face. Shot on 35mm, shallow depth of field, warm practical lighting from paper lanterns. 8 seconds.
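
If you write many prompts, it can help to treat the slots as named fields so that none gets forgotten. The following is a minimal sketch, not any provider's API: the `VideoPrompt` class and its field names are hypothetical, chosen only to mirror the slots listed above, and the output is plain text you would paste into whichever tool you use.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """Hypothetical container for the prompt slots named in this lesson."""
    subject: str
    setting: str
    action: str
    camera: str      # camera move, e.g. a dolly-in followed by a tilt
    style: str
    lighting: str
    duration_s: int  # requested clip length in seconds

    def render(self) -> str:
        """Join the slots into one plain-text prompt."""
        if self.duration_s <= 0:
            raise ValueError("duration_s must be positive")
        sentences = [
            f"{self.subject} in {self.setting} {self.action}",
            self.camera,
            f"{self.style}, {self.lighting}",
            f"{self.duration_s} seconds",
        ]
        # Strip stray periods so each sentence ends with exactly one.
        return " ".join(s.strip().rstrip(".") + "." for s in sentences)


prompt = VideoPrompt(
    subject="A chef",
    setting="a crowded Tokyo ramen shop",
    action="gently ladles broth into a bowl",
    camera="Camera slowly dollies in on her hands, then tilts up to her focused face",
    style="Shot on 35mm, shallow depth of field",
    lighting="warm practical lighting from paper lanterns",
    duration_s=8,
)
print(prompt.render())
```

Keeping motion (`action`) and camera (`camera`) as separate fields makes it obvious when a prompt describes a scene but never says what moves or how the shot is framed.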

Image-to-video and ref-image workflows

Image-to-video: generate a still you love in Midjourney or Flux, then animate it in Runway or Kling.
Reference character: upload 3-5 images of a character to keep them consistent across shots (Runway Gen-4, Kling).
Keyframe: specify first and last frame; the model fills the motion between (Luma, Runway).
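
The three workflows above differ mainly in which images accompany the prompt. As a rough illustration, the sketch below classifies a planned shot by its inputs and checks the 3-5 reference-image guideline. `ShotRequest` and its fields are hypothetical names for planning purposes only; they do not correspond to Runway's, Kling's, or Luma's actual request formats.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ShotRequest:
    """Hypothetical shot description; not any provider's real API."""
    prompt: str
    start_image: Optional[str] = None  # image-to-video: the still to animate
    end_image: Optional[str] = None    # keyframe: the desired last frame
    character_refs: List[str] = field(default_factory=list)

    def mode(self) -> str:
        """Report which workflow this shot uses, checking inputs first."""
        if self.end_image and not self.start_image:
            raise ValueError("a keyframe shot needs a first frame as well as a last frame")
        if self.character_refs and not 3 <= len(self.character_refs) <= 5:
            raise ValueError("this lesson suggests 3-5 reference images per character")
        # The first matching workflow wins when several inputs are present.
        if self.start_image and self.end_image:
            return "keyframe"
        if self.start_image:
            return "image-to-video"
        if self.character_refs:
            return "reference-character"
        return "text-to-video"


shot = ShotRequest(
    prompt="She turns toward the window.",
    start_image="still.png",
    end_image="last.png",
)
print(shot.mode())  # keyframe
```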

What breaks

Complex hand interactions (holding, typing) still glitch.
Long narratives — characters drift over 10+ seconds without explicit reference.
Physics in unusual scenarios (zero-g, underwater with many objects).
Fine text on signs/screens — still garbled most of the time.
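
One way to work around character drift is to plan a longer piece as a series of short shots and reuse the same reference images in each. The sketch below splits a target runtime into equal shots under a cap. The function name is hypothetical, and the 10-second default is an assumption taken from the drift threshold mentioned above, not a documented limit of any model.

```python
import math
from typing import List


def split_into_shots(total_s: float, max_shot_s: float = 10.0) -> List[float]:
    """Split a target runtime into equal shots no longer than max_shot_s."""
    if total_s <= 0 or max_shot_s <= 0:
        raise ValueError("durations must be positive")
    # Smallest number of shots that keeps every shot at or under the cap.
    count = math.ceil(total_s / max_shot_s)
    return [total_s / count] * count


print(split_into_shots(45))  # [9.0, 9.0, 9.0, 9.0, 9.0]
```

Equal-length shots are only a starting point; in practice you would adjust each cut to fall on a natural change of action or camera.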

Ethical considerations

Video deepfakes of real people are a serious concern. The major providers (OpenAI, Google, Runway, Kuaishou) restrict generating named public figures without their explicit opt-in, and they mark outputs as AI-generated (C2PA provenance metadata, plus SynthID watermarking for Google). If you're shipping a product, disclose AI origin and comply with the TAKE IT DOWN Act (US) and the EU AI Act's labeling requirements.
