Wan 2.2 Core Capabilities
Built on Alibaba's Wan 2.2 generative models, covering image generation, video generation, and style transfer
Text to Image
Enter a text description and Wan 2.2 generates a high-quality image in a range of artistic styles and themes. Its diffusion model interprets complex semantic information and produces images that match the description.
Image to Video
Wan 2.2 turns static images into video, adding motion to your creations. Its spatiotemporal consistency algorithm keeps generated videos natural and smooth while preserving the core features of the source image.
Style Transfer
Wan 2.2 converts styles in one click, from photo to hand-drawn or from realistic to cartoon, to suit different creative needs. It supports a variety of artistic style transfers, giving your work a distinctive look.
Open Source
We are excited to introduce Wan2.2, a major upgrade to our visual generative models, which is now open-sourced, offering more powerful capabilities, better performance, and superior visual quality.
ARCHITECTURE
MoE Architecture
Wan2.2 introduces a Mixture-of-Experts (MoE) architecture into video diffusion models. By dividing the denoising process across timesteps among specialized expert models, it enlarges overall model capacity while keeping the computational cost the same.
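The routing idea can be sketched as follows. This is a toy illustration, not the Wan2.2 implementation: the use of exactly two experts, the stand-in expert functions, the names `select_expert` and `denoise`, and the `boundary` value are all assumptions made for the example. It shows only that each timestep is handled by a single expert, so the per-step cost stays that of one expert even though more weights exist in total.

```python
from typing import Callable, Dict, List, Tuple

Latent = List[float]
Expert = Callable[[Latent, int], Latent]


def make_toy_expert(scale: float) -> Expert:
    """Return a stand-in 'expert' that only rescales the latent (not a real denoiser)."""
    def expert(latent: Latent, t: int) -> Latent:
        return [x * scale for x in latent]
    return expert


def select_expert(t: int, boundary: int) -> str:
    """Route a timestep to one expert: noisy early steps or cleaner late steps."""
    return "high_noise" if t >= boundary else "low_noise"


def denoise(latent: Latent, timesteps: List[int], boundary: int,
            experts: Dict[str, Expert]) -> Tuple[Latent, Dict[str, int]]:
    """Run the sampling loop, calling exactly one expert per timestep."""
    calls = {name: 0 for name in experts}
    for t in timesteps:
        name = select_expert(t, boundary)
        latent = experts[name](latent, t)
        calls[name] += 1
    return latent, calls


experts = {
    "high_noise": make_toy_expert(0.9),   # hypothetical expert for noisy steps
    "low_noise": make_toy_expert(0.99),   # hypothetical expert for cleaner steps
}
timesteps = list(range(9, -1, -1))        # 10 steps, from noisy (9) to clean (0)
final_latent, calls = denoise([1.0, -1.0], timesteps, boundary=5, experts=experts)
print(calls)  # one expert call per step, split at the hypothetical boundary
```

The total number of expert calls equals the number of timesteps, which is why adding a second expert raises capacity without raising the cost of any individual step.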
SCALING
Data Scaling
Compared to Wan2.1, Wan2.2 is trained on significantly more data, with 65.6% more images and 83.2% more videos. This expansion notably improves the model's generalization across dimensions such as motion, semantics, and aesthetics.
AESTHETICS
Cinematic Aesthetics
Wan2.2 incorporates specially curated aesthetic data with fine-grained labels for lighting, composition, and color. This allows for more precise and controllable cinematic style generation.
EFFICIENCY
Efficient High-Definition Hybrid TI2V
Wan2.2 open-sources a 5B model built with our advanced Wan2.2-VAE that achieves a compression ratio of 16×16×4. This model supports both text-to-video and image-to-video generation at 720P resolution with 24fps.
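To make the compression figure concrete, here is a rough sketch of the arithmetic. It assumes the two 16s are spatial factors (width and height) and the 4 is the temporal factor, that 720P means 1280×720 pixels, and that sizes round up with plain ceiling division; the function names and the 120-frame clip are hypothetical, and the real model may treat padding or the first frame differently.

```python
import math
from typing import Tuple

# Compression factors taken from the 16×16×4 figure above.
# Assumption: 16 and 16 are spatial (width, height), 4 is temporal.
SPATIAL_W, SPATIAL_H, TEMPORAL = 16, 16, 4


def latent_grid(width: int, height: int, frames: int) -> Tuple[int, int, int]:
    """Approximate latent grid size as (width, height, time) using ceiling division."""
    return (
        math.ceil(width / SPATIAL_W),
        math.ceil(height / SPATIAL_H),
        math.ceil(frames / TEMPORAL),
    )


def positions_per_latent() -> int:
    """Number of pixel positions (across space and time) covered by one latent position."""
    return SPATIAL_W * SPATIAL_H * TEMPORAL


# Hypothetical clip: 1280x720 pixels, 120 frames (5 seconds at 24 fps).
grid = latent_grid(width=1280, height=720, frames=120)
print(grid)                    # (80, 45, 30)
print(positions_per_latent())  # 1024
```

Under these assumptions each latent position stands in for 1,024 pixel positions, which is what lets a 720P, 24fps model fit within a modest memory budget.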
Available Models
Three specialized models designed for different use cases and performance requirements
Wan2.2-T2V-A14B
The T2V-A14B model supports generating 5s videos at both 480P and 720P resolutions. Built with a Mixture-of-Experts (MoE) architecture, it delivers outstanding video generation quality.

Wan2.2-I2V-A14B
The I2V-A14B model, designed for image-to-video generation, supports both 480P and 720P resolutions. Built with a Mixture-of-Experts (MoE) architecture, it achieves more stable video synthesis.
Wan2.2-TI2V-5B
The TI2V-5B model supports both text-to-video and image-to-video generation at 720P resolution with 24fps and can run on a single consumer-grade GPU such as the RTX 4090.
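As a quick way to compare the three models, the descriptions above can be captured in a small lookup table. This is an illustrative sketch only: the `MODELS` dictionary and the `candidate_models` helper are hypothetical, the task labels are inferred from the model names and descriptions, and nothing here calls an actual Wan2.2 API.

```python
from typing import Dict, List

# Capabilities as stated in the model descriptions above (task labels are inferred).
MODELS: Dict[str, dict] = {
    "Wan2.2-T2V-A14B": {
        "tasks": {"text-to-video"},
        "resolutions": {"480P", "720P"},
    },
    "Wan2.2-I2V-A14B": {
        "tasks": {"image-to-video"},
        "resolutions": {"480P", "720P"},
    },
    "Wan2.2-TI2V-5B": {
        "tasks": {"text-to-video", "image-to-video"},
        "resolutions": {"720P"},
    },
}


def candidate_models(task: str, resolution: str) -> List[str]:
    """Return the names of models whose listed tasks and resolutions match the request."""
    return sorted(
        name
        for name, spec in MODELS.items()
        if task in spec["tasks"] and resolution in spec["resolutions"]
    )


print(candidate_models("text-to-video", "720P"))
print(candidate_models("image-to-video", "480P"))
```

When more than one model matches, the choice comes down to the trade-off described above: the A14B models use the MoE architecture, while the 5B model targets a single consumer-grade GPU.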
Wan 2.2 Frequently Asked Questions
Learn about Wan 2.2's core features, technical characteristics, and application scenarios
Basic Features
Understand Wan 2.2's core functions and basic concepts
Technical Specifications
Dig into Wan 2.2's technical details and hardware requirements
Application Scenarios
Explore Wan 2.2's practical applications in different fields