By attribute · 9 models · 105 documented failure modes

Which AI video model keeps a consistent face across cuts?

Runway Gen-4 Scenes mode is the only documented model built specifically to hold a character across multiple cuts; others drift after a few cuts.

Last updated June 16, 2026 · Methodology: documented-failure-mode catalogue, not invented scores

Short answer

Runway Gen-4 (Scenes mode) is the only model in this catalogue documented as built to hold the same character across multiple cuts; the other models visibly drift after a few cuts. For multi-shot work with a recurring face it has the strongest documented identity track record, though no model is reliable on extreme close-ups.

Identity holding is the kind of consistency most people actually mean when they ask which model is “consistent.” Runway’s Scenes mode references a locked character, which is why its documented identity failures are narrower than its peers’. The other models can match a face within a single take but drift across cuts, because each clip is generated as a fresh sample. If your project is one continuous shot, the gap narrows; if it spans cuts with the same person, Runway is the documented pick.

See the documented evidence: Runway Gen-4 profile, the full failure catalogue, or the overall consistency ranking.

Full context

Documented failure profile, every model

| Model | Documented modes | Holds best on | Documented weak spot |
|---|---|---|---|
| Google Veo 3 | 13 | native audio, single-shot photoreal, lighting | long-prompt instruction drop, camera-motion-ignored on locked-off shots |
| Runway Gen-4 | 13 | character identity across cuts (Scenes mode) | hand anatomy on close-ups, prompt-ignored on dense prompts |
| OpenAI Sora 2 | 12 | stylized motion (historically) | camera-control failures, multi-character interaction |
| ByteDance Seedance | 12 | short stylized clips | style-preset drift, motion drift over long clips |
| Luma Dream Machine Ray-2 | 12 | lighting realism, atmospheric single takes | identity drift past ~3 cuts, camera-path drift |
| Vidu | 11 | reference-to-video character carry | motion plausibility, color drift |
| Pika 2.0 | 11 | stylized short-form, the closest Sora-style substitute | face distortion on long clips, motion failures |
| Kling 1.6 | 11 | human motion on simple single-subject shots | motion-blur overload, prompt adherence on complex scenes |
| Hailuo MiniMax | 10 | expressive faces on close-ups | camera-shake artifacts, physics collapse |
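
For readers who want to query the profile programmatically, the table above can be transcribed into a small data structure. The sketch below is illustrative only: the `ModelProfile` class, the `CATALOGUE` tuple, and the `models_matching` helper are hypothetical names introduced here, and the plain substring match is an assumption, not how any tool mentioned on this page works. The counts and phrases are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """One row of the failure-profile table (hypothetical container)."""
    name: str
    documented_modes: int
    holds_best_on: tuple[str, ...]
    weak_spots: tuple[str, ...]


# Rows transcribed from the table, in the same order.
CATALOGUE = (
    ModelProfile("Google Veo 3", 13,
                 ("native audio", "single-shot photoreal", "lighting"),
                 ("long-prompt instruction drop",
                  "camera-motion-ignored on locked-off shots")),
    ModelProfile("Runway Gen-4", 13,
                 ("character identity across cuts (Scenes mode)",),
                 ("hand anatomy on close-ups", "prompt-ignored on dense prompts")),
    ModelProfile("OpenAI Sora 2", 12,
                 ("stylized motion (historically)",),
                 ("camera-control failures", "multi-character interaction")),
    ModelProfile("ByteDance Seedance", 12,
                 ("short stylized clips",),
                 ("style-preset drift", "motion drift over long clips")),
    ModelProfile("Luma Dream Machine Ray-2", 12,
                 ("lighting realism", "atmospheric single takes"),
                 ("identity drift past ~3 cuts", "camera-path drift")),
    ModelProfile("Vidu", 11,
                 ("reference-to-video character carry",),
                 ("motion plausibility", "color drift")),
    ModelProfile("Pika 2.0", 11,
                 ("stylized short-form", "the closest Sora-style substitute"),
                 ("face distortion on long clips", "motion failures")),
    ModelProfile("Kling 1.6", 11,
                 ("human motion on simple single-subject shots",),
                 ("motion-blur overload", "prompt adherence on complex scenes")),
    ModelProfile("Hailuo MiniMax", 10,
                 ("expressive faces on close-ups",),
                 ("camera-shake artifacts", "physics collapse")),
)


def total_documented_modes(catalogue=CATALOGUE) -> int:
    """Sum the per-model failure-mode counts."""
    return sum(m.documented_modes for m in catalogue)


def models_matching(keyword, field="holds_best_on", catalogue=CATALOGUE):
    """Return model names whose chosen column mentions the keyword.

    Assumption: a case-insensitive substring match is good enough for a
    quick lookup; it is not a scoring method.
    """
    needle = keyword.lower()
    return [m.name for m in catalogue
            if any(needle in entry.lower() for entry in getattr(m, field))]


if __name__ == "__main__":
    print(total_documented_modes())                  # 105
    print(models_matching("identity"))               # ['Runway Gen-4']
    print(models_matching("identity", "weak_spots")) # ['Luma Dream Machine Ray-2']
```

The per-model counts sum to the 105 documented failure modes stated at the top of the page, which is a quick check that the transcription is complete. Searching the weak-spot column for "identity" returns only Luma, the one model whose row puts a number (~3 cuts) on cross-cut drift.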

