Kling V3 Omni — Multi-reference video — up to 4 images + 1 video + audio output in one model.

Kuaishou (Kling AI) · From 40 credits / 5s on Dahab Studio.

Kling V3 Omni is Kuaishou's flagship multi-reference video model. Unlike single-image image-to-video models, Omni accepts up to 4 reference images (for character consistency, style transfer, or scene composition) plus 1 reference video (for camera movement or visual style). Output is 1080p with optional native audio. Dahab Studio routes requests directly to Kling's Singapore API.

Specs

  • Max duration: 15s
  • Resolution: 1080p (pro mode)
  • Aspect ratios: 16:9, 9:16, 1:1
  • Native audio: Yes (pro mode; not available when a reference video is supplied)
  • Multi-reference: Yes (up to 4 images + 1 video)
  • Pricing: 40 cr / 5s, 79 cr / 15s
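
The limits in this list can be checked client-side before a request is sent. A minimal sketch in Python, assuming a hypothetical request shape: the function name `check_request` and its parameters are illustrative and are not part of Kling's or Dahab Studio's API. The minimum duration is not stated on this page, so the sketch only requires a positive value.

```python
# Limits taken from the specs on this page.
MAX_DURATION_S = 15
SUPPORTED_RATIOS = {"16:9", "9:16", "1:1"}
MAX_REFERENCE_IMAGES = 4
MAX_REFERENCE_VIDEOS = 1


def check_request(duration_s, aspect_ratio, n_images=0, n_videos=0):
    """Return a list of problems; an empty list means the request fits the specs.

    Hypothetical helper: the argument names are assumptions, not real API fields.
    """
    problems = []
    # Assumption: any positive duration up to the maximum is acceptable.
    if not 0 < duration_s <= MAX_DURATION_S:
        problems.append(f"duration must be > 0 and <= {MAX_DURATION_S}s, got {duration_s}s")
    if aspect_ratio not in SUPPORTED_RATIOS:
        problems.append(f"unsupported aspect ratio {aspect_ratio!r}")
    if n_images > MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images, got {n_images}")
    if n_videos > MAX_REFERENCE_VIDEOS:
        problems.append(f"at most {MAX_REFERENCE_VIDEOS} reference video, got {n_videos}")
    return problems
```

For example, `check_request(10, "16:9", n_images=3, n_videos=1)` returns an empty list, while a 20s request at 4:3 with 5 images and 2 videos returns one message per violated limit.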

Use cases

  • Character-consistent storytelling: Pass 2–3 reference images of the same character (front, side, close-up) so the generated video keeps consistent face, body, and clothing.
  • Virtual try-on / fashion: Reference image of a model + reference image of a garment → generated video of the model wearing it.
  • Style transfer: Reference video sets the camera movement or visual aesthetic; reference image sets the subject. Output blends both.
  • Multi-shot brand videos: Up to 6 storyboard shots in one generation, each with its own prompt and duration, totaling ≤15 seconds.
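
The multi-shot constraints above (at most 6 shots, each with its own prompt and duration, at most 15 seconds in total) lend themselves to a simple pre-flight check. The following is a sketch only: the `Shot` class and `validate_storyboard` function are hypothetical names introduced for illustration and do not reflect the actual request format.

```python
from dataclasses import dataclass

MAX_SHOTS = 6      # "Up to 6 storyboard shots in one generation"
MAX_TOTAL_S = 15   # "totaling <= 15 seconds"


@dataclass
class Shot:
    """One storyboard shot (hypothetical structure, not an API type)."""
    prompt: str
    duration_s: float


def validate_storyboard(shots):
    """Raise ValueError if the storyboard breaks a stated limit; return total seconds."""
    if not shots:
        raise ValueError("storyboard needs at least one shot")
    if len(shots) > MAX_SHOTS:
        raise ValueError(f"too many shots: {len(shots)} > {MAX_SHOTS}")
    for i, shot in enumerate(shots, start=1):
        if not shot.prompt.strip():
            raise ValueError(f"shot {i} has an empty prompt")
        if shot.duration_s <= 0:
            raise ValueError(f"shot {i} has a non-positive duration")
    total = sum(shot.duration_s for shot in shots)
    if total > MAX_TOTAL_S:
        raise ValueError(f"total duration {total}s exceeds {MAX_TOTAL_S}s")
    return total
```

A three-shot storyboard of 5s, 5s, and 4s passes and returns 14; adding a fourth 3s shot would raise because the total reaches 17s.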

Kling V3 Omni vs alternatives

  • vs Bytedance Seedance 2.0: Same multi-reference capability. Kling Omni delivers stronger character consistency on faces; Seedance is sharper on product shots.
  • vs Runway Gen-3 Alpha: Kling Omni offers direct API access, whereas Gen-3 Alpha is Runway-hosted only. Kling Omni also accepts more reference images (4 vs 1 for Runway).
  • vs Single-image i2v models: Multi-reference makes Omni the right pick when consistency matters across multiple subjects or shots.

Frequently asked questions

How does Kling V3 Omni differ from regular image-to-video?
Standard i2v takes one image as the start frame. Omni takes multiple images (up to 4) as conceptual references: they don't lock the first frame; they guide what appears in the video. You can also add a reference video for style or camera matching.
Does Kling Omni generate audio?
Yes, in pro mode (1080p). Native synchronised audio is enabled by default unless a reference video is supplied (those two are mutually exclusive in Kling's API).
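The rule described in this answer can be written as a single condition. A minimal sketch, assuming hypothetical flag names (`pro_mode`, `has_reference_video`) that are not taken from Kling's API:

```python
def native_audio_enabled(pro_mode: bool, has_reference_video: bool) -> bool:
    """Audio is on by default in pro mode, unless a reference video is supplied.

    Both parameter names are illustrative assumptions.
    """
    return pro_mode and not has_reference_video
```

In other words, a pro-mode request with only reference images would get audio, while the same request with a reference video would not.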
How do I reference specific images in my prompt?
Use <<<image_1>>> and <<<image_2>>> tokens in the prompt to point at the corresponding entries in your reference list. Example: "The person from <<<image_1>>> wearing the jacket from <<<image_2>>>."
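A prompt can be checked for tokens that point past the end of the reference list. The sketch below assumes the token index is 1-based and maps to list position, which the example above suggests but this page does not state outright; `referenced_indices` and `missing_references` are hypothetical helper names.

```python
import re

# Matches tokens such as <<<image_1>>> and captures the number.
TOKEN_RE = re.compile(r"<<<image_(\d+)>>>")


def referenced_indices(prompt):
    """Return the sorted, de-duplicated image indices mentioned in the prompt."""
    return sorted({int(n) for n in TOKEN_RE.findall(prompt)})


def missing_references(prompt, reference_images):
    """Return token indices with no matching entry in reference_images.

    Assumption: <<<image_N>>> refers to the N-th entry, counting from 1.
    """
    count = len(reference_images)
    return [i for i in referenced_indices(prompt) if not 1 <= i <= count]
```

With the example prompt above and a reference list of two images, `missing_references` returns an empty list; with only one image supplied, it returns `[2]`.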
Is Kling V3 Omni cheaper than Replicate?
Yes. Dahab Studio routes directly to Kling's Singapore API instead of through Replicate's wrapper, which saves roughly $1.26 per 10s video. That saving is kept in our margin so user-facing pricing remains predictable.
What aspect ratios are supported?
16:9, 9:16, and 1:1. Kling V3 Omni does not support 4:3 or 3:4; requests for those ratios are routed to a fallback ratio.

Related models

  • Bytedance Seedance 2.0 — Universal-reference video — up to 4 images + 1 video + 1 audio file.
  • Kling V2.6 — Kling's latest text-to-video and image-to-video at 1080p with audio.
  • Grok Imagine Video — xAI's native-audio video model — fast, cheap, and 1080p out of the box.
