By attribute · 9 models · 105 documented failure modes
Which AI video model has the best cinematic lighting?
Luma Ray-2 documents the fewest lighting-related failures and leads on photoreal cinematic light for mood-led single-shot work.
Short answer
Luma Ray-2. It documents the fewest lighting-related failures of the nine covered models and is the strongest documented option for lighting-critical, mood-led single takes. Runway is stronger on identity and Veo on native audio, but the other models cluster more lighting issues.
Lighting is where Luma’s documented profile is cleanest: it holds photoreal light direction and mood within a single take better than peers whose failures cluster elsewhere. The caveat applies to every model here: the claim covers a single take only. Across cuts, lighting can shift because each clip is sampled afresh. For mood-led single-shot work, Luma is the documented lead; for multi-cut continuity, the deciding factor is identity consistency rather than lighting.
See the documented evidence: Luma Ray-2 profile, the full failure catalogue, or the overall consistency ranking.
Full context
Documented failure profile, every model
| Model | Documented modes | Holds best on | Documented weak spot |
|---|---|---|---|
| Google Veo 3 | 13 | native audio, single-shot photoreal, lighting | long-prompt instruction drop, camera-motion-ignored on locked-off shots |
| Runway Gen-4 | 13 | character identity across cuts (Scenes mode) | hand anatomy on close-ups, prompt-ignored on dense prompts |
| OpenAI Sora 2 | 12 | stylized motion (historically) | camera-control failures, multi-character interaction |
| ByteDance Seedance | 12 | short stylized clips | style-preset drift, motion drift over long clips |
| Luma Dream Machine Ray-2 | 12 | lighting realism, atmospheric single takes | identity drift past ~3 cuts, camera-path drift |
| Vidu | 11 | reference-to-video character carry | motion plausibility, color drift |
| Pika 2.0 | 11 | stylized short-form, the closest Sora-style substitute | face distortion on long clips, motion failures |
| Kling 1.6 | 11 | human motion on simple single-subject shots | motion-blur overload, prompt adherence on complex scenes |
| Hailuo MiniMax | 10 | expressive faces on close-ups | camera-shake artifacts, physics collapse |
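As a quick sanity check on the headline figures, the per-model counts in the table can be summed. The snippet below is only a sketch: the dictionary name `documented_modes` is an illustrative choice, and the numbers are taken from the "Documented modes" column above.

```python
# Documented failure-mode counts, taken from the table above.
# The variable names are illustrative, not part of any published dataset.
documented_modes = {
    "Google Veo 3": 13,
    "Runway Gen-4": 13,
    "OpenAI Sora 2": 12,
    "ByteDance Seedance": 12,
    "Luma Dream Machine Ray-2": 12,
    "Vidu": 11,
    "Pika 2.0": 11,
    "Kling 1.6": 11,
    "Hailuo MiniMax": 10,
}

model_count = len(documented_modes)
total_modes = sum(documented_modes.values())

print(f"{model_count} models, {total_modes} documented failure modes")
# prints "9 models, 105 documented failure modes"
```

Note that these are totals across all failure types. The table does not break out lighting-related modes separately, so a lower total here does not by itself indicate better lighting.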
Which model holds…
Pick by the thing that has to stay consistent
Holds a consistent face across cuts
Runway Gen-4
Runway Gen-4 Scenes mode is the only documented model built specifically to hold a character across multiple cuts; others drift after a few cuts.
Holds readable on-screen text
No model reliably does
Text rendering is a documented failure mode for every covered model: all of them garble text past roughly six characters. Add text in post instead of relying on the model.
Holds correct hands in close-up
No model reliably does
Hand-anatomy failure is documented across every model. Frame hands away from camera or expect to re-roll; no model has solved close-up finger topology.
Holds native audio with the video
Google Veo 3
Veo is the only covered model with native audio generation; the rest produce silent video that needs separate audio.
Holds long, multi-instruction prompts
No model reliably does
Every model documents an instruction-drop / prompt-adherence failure that worsens as prompt length grows. Front-load must-haves and keep prompts short.
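The picks above can be read as a small lookup table: each requirement maps either to a recommended model or to "no model", with a workaround. The following is a minimal sketch of that idea. The requirement keys, the `Pick` type, and `pick_model` are hypothetical names chosen for illustration; only the recommendations and workarounds come from this page.

```python
from typing import NamedTuple, Optional


class Pick(NamedTuple):
    model: Optional[str]  # None means no covered model reliably holds this
    note: str             # supporting reason or suggested workaround


# Hypothetical requirement keys; the values summarize the picks listed above.
PICKS = {
    "face_across_cuts": Pick("Runway Gen-4", "Scenes mode is built to hold a character across cuts."),
    "on_screen_text": Pick(None, "Add text in post."),
    "hands_close_up": Pick(None, "Frame hands away from camera or expect to re-roll."),
    "native_audio": Pick("Google Veo 3", "Only covered model with native audio generation."),
    "long_prompts": Pick(None, "Front-load must-haves and keep prompts short."),
    "single_shot_lighting": Pick("Luma Ray-2", "Fewest documented lighting-related failures."),
}


def pick_model(requirement: str) -> Pick:
    """Return the documented pick for a requirement, or raise on an unknown key."""
    try:
        return PICKS[requirement]
    except KeyError:
        known = ", ".join(sorted(PICKS))
        raise ValueError(f"Unknown requirement {requirement!r}; expected one of: {known}") from None


if __name__ == "__main__":
    for key in PICKS:
        pick = pick_model(key)
        print(f"{key}: {pick.model or 'no model reliably does'} ({pick.note})")
```

Returning `None` for the model, rather than a best guess, mirrors the page's own answer for text, hands, and long prompts: the useful output in those cases is the workaround, not a model name.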