Kling V3 Omni — Multi-reference video — up to 4 images + 1 video + audio output in one model.
Kuaishou (Kling AI) · From 40 credits / 5s on Dahab Studio.
Kling V3 Omni is Kuaishou's flagship multi-reference video model. Unlike single-image image-to-video models, Omni accepts up to 4 reference images (for character consistency, style transfer, scene composition) plus 1 reference video (for camera movement or visual style). Output is 1080p with optional native audio. Dahab Studio routes requests directly to Kling's Singapore API.
Specs
- Max duration: 15s
- Resolution: 1080p (pro mode)
- Aspect ratios: 16:9, 9:16, 1:1
- Native audio: Yes
- Multi-reference: Yes
- Pricing: 40 cr / 5s, 79 cr / 15s
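The limits in the spec list can be checked before a job is submitted. The following is a minimal sketch, not Kling's or Dahab Studio's actual API: the `OmniRequest` class, its field names, and `validate_request` are hypothetical and exist only to illustrate the constraints listed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Limits taken from the spec list above.
MAX_REFERENCE_IMAGES = 4
MAX_DURATION_SECONDS = 15
SUPPORTED_ASPECT_RATIOS = ("16:9", "9:16", "1:1")


@dataclass
class OmniRequest:
    """Hypothetical request shape; field names are illustrative only."""
    prompt: str
    duration_seconds: int = 5
    aspect_ratio: str = "16:9"
    reference_images: List[str] = field(default_factory=list)
    reference_video: Optional[str] = None  # at most one reference video


def validate_request(req: OmniRequest) -> List[str]:
    """Return a list of problems; an empty list means the request fits the specs."""
    problems = []
    if len(req.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(
            f"too many reference images: {len(req.reference_images)} > {MAX_REFERENCE_IMAGES}"
        )
    if not 0 < req.duration_seconds <= MAX_DURATION_SECONDS:
        problems.append(
            f"duration must be between 1 and {MAX_DURATION_SECONDS} seconds"
        )
    if req.aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {req.aspect_ratio}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a form or CLI show every issue at once.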
Use cases
- Character-consistent storytelling: Pass 2–3 reference images of the same character (front, side, close-up) so the generated video keeps consistent face, body, and clothing.
- Virtual try-on / fashion: Reference image of a model + reference image of a garment → generated video of the model wearing it.
- Style transfer: Reference video sets the camera movement or visual aesthetic; reference image sets the subject. Output blends both.
- Multi-shot brand videos: Up to 6 storyboard shots in one generation, each with its own prompt and duration, totaling ≤15 seconds.
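The multi-shot limits (at most 6 shots, at most 15 seconds in total) are easy to verify up front. This is a small sketch under assumptions: the `Shot` class and `check_storyboard` function are hypothetical names, and the real API may represent storyboards differently.

```python
from dataclasses import dataclass
from typing import List

MAX_SHOTS = 6            # storyboard shots per generation
MAX_TOTAL_SECONDS = 15   # combined duration of all shots


@dataclass
class Shot:
    """Hypothetical storyboard entry: one prompt and one duration per shot."""
    prompt: str
    duration_seconds: float


def check_storyboard(shots: List[Shot]) -> float:
    """Raise ValueError if a limit is broken; otherwise return the total duration."""
    if not shots:
        raise ValueError("storyboard needs at least one shot")
    if len(shots) > MAX_SHOTS:
        raise ValueError(f"too many shots: {len(shots)} > {MAX_SHOTS}")
    if any(shot.duration_seconds <= 0 for shot in shots):
        raise ValueError("every shot needs a positive duration")
    total = sum(shot.duration_seconds for shot in shots)
    if total > MAX_TOTAL_SECONDS:
        raise ValueError(f"total duration {total}s exceeds {MAX_TOTAL_SECONDS}s")
    return total
```

For example, three shots of 5, 4, and 6 seconds pass the check because they total exactly 15 seconds, while adding one more second to any shot would be rejected.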
Kling V3 Omni vs alternatives
- vs Bytedance Seedance 2.0: Same multi-reference capability. Kling Omni delivers stronger character consistency on faces; Seedance is sharper on product shots.
- vs Runway Gen-3 Alpha: Kling Omni is reachable through direct API access rather than only through Runway's own hosting. Kling Omni also accepts more reference images (4 vs 1 for Runway).
- vs Single-image i2v models: Multi-reference makes Omni the right pick when consistency matters across multiple subjects or shots.
Frequently asked questions
- How does Kling V3 Omni differ from regular image-to-video?
- Standard i2v takes one image as the start frame. Omni takes multiple images (up to 4) as conceptual references — they don't lock the first frame, they guide what appears in the video. Plus you can add a reference video for style/camera matching.
- Does Kling Omni generate audio?
- Yes, in pro mode (1080p). Native synchronised audio is enabled by default unless a reference video is supplied (those two are mutually exclusive in Kling's API).
- How do I reference specific images in my prompt?
- Use <<<image_1>>> and <<<image_2>>> tokens in the prompt to point at the corresponding entries in your reference list. Example: "The person from <<<image_1>>> wearing the jacket from <<<image_2>>>."
- Is Kling V3 Omni cheaper than Replicate?
- Yes. Dahab Studio routes directly to Kling's Singapore API instead of through Replicate's wrapper, which saves about $1.26 per 10s video. That saving is kept in our margin so user-facing pricing remains predictable.
- What aspect ratios are supported?
- 16:9, 9:16, and 1:1. Kling V3 Omni does not support 4:3 or 3:4 — those route to a fallback ratio.
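Two of the answers above, the `<<<image_N>>>` prompt tokens and the audio rule, can be sketched in a few lines. The helper names are hypothetical, and the sketch assumes the tokens are 1-based (so `<<<image_1>>>` points at the first entry in the reference list), which the examples above suggest but do not state outright.

```python
import re
from typing import List, Optional

# Matches tokens such as <<<image_1>>> and captures the number.
TOKEN_PATTERN = re.compile(r"<<<image_(\d+)>>>")


def referenced_image_indices(prompt: str) -> List[int]:
    """Return the image numbers mentioned in the prompt, in order of appearance."""
    return [int(number) for number in TOKEN_PATTERN.findall(prompt)]


def check_prompt_tokens(prompt: str, reference_images: List[str]) -> None:
    """Raise ValueError if the prompt points at an image missing from the list.

    Assumes 1-based numbering: <<<image_1>>> is reference_images[0].
    """
    for index in referenced_image_indices(prompt):
        if not 1 <= index <= len(reference_images):
            raise ValueError(
                f"prompt uses <<<image_{index}>>> but only "
                f"{len(reference_images)} reference image(s) were supplied"
            )


def audio_enabled(pro_mode: bool, reference_video: Optional[str]) -> bool:
    """Audio is on in pro mode unless a reference video is supplied."""
    return pro_mode and reference_video is None
```

Checking tokens against the reference list before submission catches the common mistake of mentioning `<<<image_2>>>` after uploading only one image.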
Related models
- Bytedance Seedance 2.0 — Universal-reference video — up to 4 images + 1 video + 1 audio file.
- Kling V2.6 — Kling's latest text-to-video and image-to-video at 1080p with audio.
- Grok Imagine Video — xAI's native-audio video model — fast, cheap, and 1080p out of the box.