Advanced physics simulation for realistic, complex movements. Supports an end frame for creating loops.
Seedance 1.0 Pro Fast: 3-12 /s
Cheaper, faster variant of Seedance 1.0 Pro via BytePlus. ~50% cost savings vs the standard 1.0 Pro at the same resolutions.
Seedance 1.5 Pro: 3-12 /s
ByteDance Seedance 1.5 Pro via BytePlus. High-precision audio-visual sync, cinematic motion, and emotional expression. Supports first/last frames.
Seedance 1.0 Lite I2V: 4-14 /s
ByteDance Seedance 1.0 Lite Image-to-Video — efficient, cost-effective animation of source images. Direct via BytePlus.
Dreamina Seedance 2.0 Fast: 5-20 /s
Faster, cheaper sibling of Seedance 2.0 via BytePlus. Same multimodal capability set (text + image + video + audio references, native sync audio, multi-shot narration) at roughly 40% lower per-second cost. Trades a bit of fidelity for speed — ideal for iteration loops.
Seedance 1.0 Pro: 6-30 /s
ByteDance Seedance 1.0 Pro via BytePlus. Comprehensive and powerful video generation with strong motion control.
Dreamina Seedance 2.0: 8-30 /s
ByteDance flagship multimodal video model via BytePlus — accepts text + reference images + reference video + reference audio. Native synchronized audio, pro camera controls, multi-shot narration. Cheaper than the Replicate route and exposes capabilities Replicate hides.
P-Video: 8 /s
Pruna P-Video — fast, affordable text-to-video at $0.02/sec. Standard aspect ratios, 3-15 second clips. Strong bang-for-buck for prototypes and short-form content.
Grok Imagine Video: 10 /s
xAI Imagine API video — fast text-to-video and image-to-video at a flat $0.05/sec. Async polling pattern; supports 720p, 1-15 second durations. Native audio is included on every generation (cannot be turned off).
Wan 2.6 I2V Flash: 10-15 /s
Fast image-to-video with optional audio sync. Faster inference than standard Wan 2.6 I2V. Up to 15 seconds.
Happy Horse 1.0 I2V: 14 /s
Direct-to-DashScope Happy Horse 1.0 image-to-video. Strict consistency with the source image, fluent natural motion, native audio. 3-15 seconds.
Happy Horse 1.0 T2V: 14 /s
Direct-to-DashScope Happy Horse 1.0 text-to-video. Cheaper than the Replicate path. 3-15 second durations, five aspect ratios, native audio always on.
Happy Horse 1.0 R2V: 20 /s
Reference-to-video — combines up to 9 reference images for strong subject + scene consistency. Direct-to-DashScope only (Replicate proxy hides this).
Veo 3.1 Fast: 20 /s
Google's Veo 3.1 Fast with native audio and frame-to-frame generation. Supports start and end frames for seamless transitions.
Wan 2.6 I2V: 20-30 /s
Alibaba Wan 2.6 image-to-video with multi-shot storytelling, native audio, and precise lip-sync. Up to 15 seconds.
Wan 2.6 T2V: 20-30 /s
Alibaba's latest text-to-video model with multi-shot storytelling, native audio, and precise lip-sync. Up to 15 seconds.
Happy Horse 1.0 Video Edit: 24 /s
Local or global edits to an existing video using natural-language instructions and up to 5 reference images. Preserves original motion. Direct-to-DashScope only.
Kling v3: 34-45 /s
Kuaishou Kling v3 — cinematic text-to-video and image-to-video up to 15 seconds with native audio and lip-synced dialogue. Supports start and end frames. Standard mode = 720p, Pro mode = 1080p.
Veo 3.1: 40 /s
Google's flagship video model with strongest prompt adherence and cinematic motion. Synchronized native audio, reference images, and start+end frame control.
Kling v2.1 Master: 56 /s
Premium Kling model with enhanced quality and longer durations.
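Two of the video listings quote flat per-second dollar prices (P-Video at $0.02/sec for 3-15 second clips, Grok Imagine Video at $0.05/sec for 1-15 second clips), so the cost of one clip is the rate multiplied by the duration. The sketch below works through that arithmetic. The `FLAT_RATES` table and the `estimate_cost` function are hypothetical names chosen for illustration and are not part of any provider's API; only the rates and duration limits come from the descriptions.

```python
from decimal import Decimal

# Flat per-second prices and duration limits as quoted in the listings.
# The table layout and all names here are illustrative assumptions.
FLAT_RATES = {
    "P-Video": {"usd_per_sec": Decimal("0.02"), "min_s": 3, "max_s": 15},
    "Grok Imagine Video": {"usd_per_sec": Decimal("0.05"), "min_s": 1, "max_s": 15},
}


def estimate_cost(model: str, seconds: int) -> Decimal:
    """Return the estimated USD cost of one clip at a flat per-second rate."""
    spec = FLAT_RATES[model]
    # Reject durations outside the range the listing advertises.
    if not spec["min_s"] <= seconds <= spec["max_s"]:
        raise ValueError(
            f"{model} supports {spec['min_s']}-{spec['max_s']} second clips, got {seconds}"
        )
    return spec["usd_per_sec"] * seconds


if __name__ == "__main__":
    for name in FLAT_RATES:
        print(name, "10 s clip:", estimate_cost(name, 10), "USD")
```

`Decimal` is used instead of `float` so that small currency amounts multiply without binary rounding noise. The ranged listings (for example "3-12 /s") are left out because the catalog does not say what determines where in the range a given job falls.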
Pruna P-Image (direct): sub-second text-to-image built for production. Strong prompt adherence and crisp text rendering at standard aspect ratios.
Z-Image Turbo: 1
Alibaba Z-Image Turbo: lightweight text-to-image with bilingual (Chinese+English) text rendering. Fast, cheap, flexible resolutions from 512×512 up to 2048×2048.
Flux Schnell: 2
Ultra-fast image generation at a fraction of the cost. Perfect for quick iterations and testing ideas.
P-Image Edit: 2
Pruna P-Image Edit: focused image editing with 1-5 reference images, sub-second inference, and exact-prompt adherence. Strict image-to-image (requires at least one source image). 256-1440 px output in standard aspect ratios.
Wan 2.6 Image: 2
Alibaba Wan 2.6 text-to-image: strong cinematic photography and artistic styles. Renders at 1280×1280 to 1440×1440 with five aspect ratios.
Grok Imagine: 4
xAI's Imagine API image model: fast generation and editing with strong prompt fidelity.
Qwen Image: 4
Alibaba Qwen Image: flagship text-to-image with strong text rendering, multilingual prompt support, and crisp realism.
Wan 2.7 Image: 4
Alibaba Wan 2.7 image: text-to-image, image editing with up to 9 reference images, and multi-image reference generation. Up to 2K resolution. Previously routed through Replicate at roughly 30× the cost.
SeedEdit 3.0: 6
BytePlus SeedEdit 3.0: focused image-editing model. Takes a source image plus an instruction and applies the edit in-place ("make the bubbles heart-shaped", "change the sky to sunset"). Adaptive output sizing follows the input image dimensions. Strict image-to-image only (no pure text-to-image).
Seedream 4.0: 6
BytePlus Seedream 4.0: multimodal image creation with text + single-image + multi-image inputs. Multi-image blending up to 10 references, sequential image-set generation up to 15 outputs, 4K ultra-HD support. Direct via BytePlus.
Seedream 5.0 Lite: 7
BytePlus Dola-Seedream 5.0 Lite: flagship image model with web-connected retrieval, multi-image fusion, image-set generation, and strong consistency preservation. Supports text-to-image and image-to-image (with up to 4 reference images). Up to 2K output. Direct via BytePlus.
Qwen Image Edit: 8
Alibaba Qwen Image Edit: image-to-image with multi-reference editing, text-in-image rendering, and example-based composition.
Seedream 4.5: 8
BytePlus Seedream 4.5: refined image model with strong editing consistency, multi-image fusion, finer detail control, natural small-text and face rendering, and 2560×1440 to 4096×4096 output. Supports text-to-image and image-to-image with up to 4 reference images.
Wan 2.7 Image Pro: 8
Alibaba Wan 2.7 image Pro: premium variant with 4K text-to-image, thinking mode for higher quality, image editing, and multi-image reference.
Seedream 4.0: 9
Next-generation image model with unified generation & editing. Supports sequential image generation for coherence and multi-reference workflows.
Grok Imagine Quality: 10
xAI's higher-quality Imagine image model: sharper detail and stronger composition than the standard Grok Imagine, at a higher per-image cost.
Gemini 2.5 Flash Image: 12
Google's fast image model with good quality and speed. Excellent at following natural language prompts. ⚠️ This model only supports Square (1:1) and Match Input aspect ratios.
Seedream 4.5: 12
Upgraded ByteDance image model with stronger spatial understanding and world knowledge.
Nano Banana 2: 14
Google's latest fast image model with high-efficiency production-scale visual creation. Supports multiple resolutions up to 4K.
Nano Banana Pro: 27
Google's professional design engine with a reasoning core for studio-quality 4K visuals, complex layouts, and precise text rendering.
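Several image listings state how many reference images they accept: P-Image Edit takes 1-5, Wan 2.7 Image up to 9, the BytePlus Seedream 4.0 listing up to 10, and Seedream 5.0 Lite and the BytePlus Seedream 4.5 listing up to 4. The sketch below shows one way to filter models by that limit before submitting a job. The `REFERENCE_LIMITS` table and `models_accepting` helper are hypothetical names for illustration. A minimum of 0 is an assumption made for models whose description also lists text-to-image. Models with no stated maximum are omitted.

```python
# (min, max) reference images per model, taken from the catalog descriptions.
# A minimum of 0 is assumed where the listing also supports text-to-image.
# Table and function names are illustrative, not a provider API.
REFERENCE_LIMITS = {
    "P-Image Edit": (1, 5),       # strict image-to-image, at least one source
    "Wan 2.7 Image": (0, 9),
    "Seedream 4.0": (0, 10),      # BytePlus direct listing
    "Seedream 5.0 Lite": (0, 4),
    "Seedream 4.5": (0, 4),       # BytePlus direct listing
}


def models_accepting(num_references: int) -> list[str]:
    """Return the models, sorted by name, that accept this many reference images."""
    if num_references < 0:
        raise ValueError("num_references must be non-negative")
    return sorted(
        name
        for name, (low, high) in REFERENCE_LIMITS.items()
        if low <= num_references <= high
    )


if __name__ == "__main__":
    for count in (0, 5, 10):
        print(count, "reference image(s):", models_accepting(count))
```

Keeping the limits in one table means a new listing only needs one added row, and the same check can run client-side before any request is made.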