Bytedance Seedance 2.0 — Universal-reference video — up to 4 images + 1 video + 1 audio file.

Bytedance · From 87 credits / 5s on Dahab Studio.

Bytedance Seedance 2.0 is a universal-reference video model: it accepts any combination of up to 4 reference images, 1 reference video, and 1 reference audio file in a single generation. That input range makes it the closest thing to a "do-everything" video model on the market. Dahab Studio offers it at $0.22/second (720p tier) and $0.55/second (1080p tier).

Specs

  • Max duration: 15s
  • Resolution: 720p or 1080p (separate tiers)
  • Aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4
  • Native audio: No
  • Multi-reference: Yes
  • Pricing: 87 cr / 5s, 174 cr / 10s (720p tier)
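
The limits in the spec list above can be checked before a generation is submitted. The following is a minimal sketch, not Dahab Studio's or Bytedance's actual API: the `GenerationRequest` shape, its field names, and the `validate` helper are all hypothetical, and the lower bound on duration is an assumption since the page only states the maximum.

```python
from dataclasses import dataclass, field

# Limits copied from the spec list on this page.
MAX_IMAGES = 4
MAX_VIDEOS = 1
MAX_AUDIO = 1
MAX_DURATION_S = 15
RESOLUTIONS = {"720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}


@dataclass
class GenerationRequest:
    """Hypothetical request shape, used for illustration only."""
    prompt: str
    duration_s: int
    resolution: str = "720p"
    aspect_ratio: str = "16:9"
    image_refs: list = field(default_factory=list)
    video_refs: list = field(default_factory=list)
    audio_refs: list = field(default_factory=list)


def validate(req: GenerationRequest) -> list:
    """Return a list of problems; an empty list means the request fits the specs."""
    problems = []
    # Assumption: any positive duration up to the stated maximum is allowed.
    if not 0 < req.duration_s <= MAX_DURATION_S:
        problems.append(f"duration must be 1-{MAX_DURATION_S}s, got {req.duration_s}s")
    if req.resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {req.resolution}")
    if req.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {req.aspect_ratio}")
    # Check each reference type against its own cap.
    caps = (
        ("image", req.image_refs, MAX_IMAGES),
        ("video", req.video_refs, MAX_VIDEOS),
        ("audio", req.audio_refs, MAX_AUDIO),
    )
    for label, refs, limit in caps:
        if len(refs) > limit:
            problems.append(f"too many {label} references: {len(refs)} > {limit}")
    return problems
```

A request with two images and one audio file passes, while one with five images or a 20-second duration comes back with a problem entry for each violated limit.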

Use cases

  • Product placement videos: Reference image of the product + reference image of the scene → generated video with the product placed naturally in the scene.
  • Character consistency: Up to 4 reference images of the same person from different angles produce a video that keeps the person's identity consistent across frames.
  • Style + voice combo: A reference video sets the motion style; a reference audio file drives the timing of the video. Combine the two for music videos or rhythmic edits.
  • Premium 1080p ads: Use the 1080p tier when client deliverables need broadcast-grade resolution. Seedance 2.0 1080p 15s sells at 599 cr (promotional discount applied).
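
The credit figures quoted on this page can be tied together with a small estimator. This is a rough sketch under a stated assumption: credits are taken to scale linearly in 5-second blocks, which matches the per-second rates quoted here but is not confirmed as Dahab Studio's billing rule. The `credit_cost` function and both tables are hypothetical names for illustration.

```python
# Credits per 5-second block, as quoted on this page.
CREDITS_PER_5S = {"720p": 87, "1080p": 217}

# Promotional overrides keyed by (tier, duration in seconds), as quoted on this page.
PROMO_CREDITS = {("1080p", 15): 599}


def credit_cost(tier: str, duration_s: int) -> int:
    """Estimate credits for a clip, ASSUMING linear pricing in 5-second blocks."""
    if tier not in CREDITS_PER_5S:
        raise ValueError(f"unknown tier: {tier}")
    if duration_s <= 0 or duration_s % 5 != 0:
        raise ValueError("duration must be a positive multiple of 5 seconds")
    # A promotional price, where one exists, replaces the linear estimate.
    promo = PROMO_CREDITS.get((tier, duration_s))
    if promo is not None:
        return promo
    return CREDITS_PER_5S[tier] * (duration_s // 5)
```

Under these assumptions, `credit_cost("720p", 5)` gives 87 and `credit_cost("1080p", 15)` gives the promotional 599 rather than the linear estimate.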

Seedance 2.0 vs alternatives

  • vs Kling V3 Omni: Seedance accepts a reference audio file in addition to images and video. Kling Omni is image+video only.
  • vs Runway Gen-3: More reference inputs (4 images vs Runway's 1). Runway is sharper on highly stylized prompts.
  • vs Veo 3.1: Seedance's universal reference handles cases where you need both visual and audio references in one generation.

Frequently asked questions

What does "universal reference" mean?
It means Seedance 2.0 can take any mix of reference inputs in one call: images, a video, and an audio file. Most other models accept only one type at a time.
What's the difference between Seedance 2.0 and 2.0 1080p on Dahab?
The 720p tier is $0.22/s (87 cr/5s). The 1080p tier is $0.55/s (217 cr/5s) and produces sharper output for client deliverables. The 1080p 15s combination has a promotional price of 599 cr (~38% margin instead of 50%).
Does Seedance generate audio?
No, the output has no native audio. If you provide a reference audio file, the video timing aligns to it, but the audio track is not included in the output unless you mux it in separately. Dahab's pipeline handles this muxing for the URL-to-Ad and UGC flows.
How many reference images can I use?
Up to 4 reference images, plus optionally 1 reference video and 1 reference audio file in the same generation.
Is Seedance 2.0 available in Arabic?
Yes. Prompts can be written in Arabic, and the model generates the visuals from them. For Arabic dialogue voiceover, Dahab automatically routes the text through ElevenLabs Egyptian TTS.
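
For the muxing step mentioned in the audio answer above, one common approach outside Dahab's pipeline is to combine the silent video with the reference audio using ffmpeg. This is a minimal sketch, assuming ffmpeg is installed and on the PATH. It is not how Dahab's pipeline works internally; the function names and file paths are hypothetical.

```python
import subprocess


def build_mux_command(video_path: str, audio_path: str, output_path: str) -> list:
    """Build an ffmpeg command that keeps the video stream and adds an audio track."""
    return [
        "ffmpeg",
        "-i", video_path,    # input 0: the generated, silent video
        "-i", audio_path,    # input 1: the reference audio file
        "-map", "0:v:0",     # take the first video stream from input 0
        "-map", "1:a:0",     # take the first audio stream from input 1
        "-c:v", "copy",      # copy video as-is, without re-encoding
        "-c:a", "aac",       # encode audio as AAC for MP4 output
        "-shortest",         # stop at the end of the shorter stream
        output_path,
    ]


def mux(video_path: str, audio_path: str, output_path: str) -> None:
    """Run the mux; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)
```

Calling `mux("clip.mp4", "track.mp3", "clip_with_audio.mp4")` would write a new file with both streams. Copying the video stream avoids a second encode, so the picture quality of the generated clip is unchanged.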

Related models

  • Kling V3 Omni — Multi-reference video — up to 4 images + 1 video + audio output in one model.
  • Google Veo 3.1 — Google's premium Veo 3.1 tier — top-quality 1080p with native audio.
  • Grok Imagine Video — xAI's native-audio video model — fast, cheap, and 1080p out of the box.
