A Joint Project by Zhejiang University & vivo AI
MagicTryOn: The New Standard in Video Virtual Try-On
Effortlessly dress any person in any video with any piece of clothing. Experience unparalleled realism, temporal consistency, and detail clarity with MagicTryOn.
Solving the Three Core Challenges of Virtual Try-On
Traditional methods fail where MagicTryOn excels.
Garment Deformation & Glitches
U-Net-based models struggle with fine detail, causing clothes to warp, disappear, or render incorrectly.
Motion Inconsistency
Poor temporal modeling means garments jitter, lag, or misalign during rapid movements like dancing.
Lack of Realism
Failure to reproduce textures, outlines, and lighting results in a video that looks artificial and unconvincing.
The MagicTryOn Advantage
Where true creative freedom meets cutting-edge technology.
Universal Compatibility
Use any person's video with any garment image. No need for specific body models, templates, or pose libraries. Total flexibility.
Robust in High-Motion Video
Even with intense movements like dancing or turning, MagicTryOn ensures the garment remains stable, coherent, and perfectly tracked.
Photorealistic Detail & Clarity
Renders textures (lace, prints), contours (seams, collars), and structure with such precision that the output looks like a real-world video recording.
Core Technology Highlights
A brief look at the innovations powering MagicTryOn.
Diffusion Transformer (DiT) Backbone
MagicTryOn replaces the traditional U-Net with a powerful Diffusion Transformer. This allows it to model long-range dependencies across video frames, ensuring spatiotemporal consistency.
The result: clothes that move naturally with the person, without jitter, lag, or artifacts, while the diffusion process guarantees high-fidelity, detailed image generation.
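For intuition, here is a minimal PyTorch sketch of a DiT-style block that attends jointly over tokens from all frames. It illustrates the general mechanism only; the dimensions and module names are assumptions, not the actual MagicTryOn architecture.

```python
import torch
import torch.nn as nn

class SpatioTemporalDiTBlock(nn.Module):
    """Toy DiT-style block: every token attends to every other token across frames."""
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, frames * patches, dim). Flattening all frames into one
        # sequence is what lets attention model long-range spatiotemporal dependencies.
        h = self.norm1(tokens)
        tokens = tokens + self.attn(h, h, h, need_weights=False)[0]
        tokens = tokens + self.mlp(self.norm2(tokens))
        return tokens
```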
# How it works:
1. Extract pose & body shape from video.
2. Extract multi-level features from garment image.
3. Feed video data + noise into DiT model.
4. The DiT denoises the whole sequence through iterative diffusion.
5. Output a new video of the person wearing the garment.
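The five steps above, as a hedged end-to-end sketch. Every helper here (extract_pose_and_shape, encode_garment, ToyVideoDiT) is a hypothetical placeholder, not the repository's actual API.

```python
import torch

# Hypothetical stand-ins for the real preprocessing and encoding modules.
def extract_pose_and_shape(video):              # step 1: pose + body-shape conditions
    b, f = video.shape[0], video.shape[1]
    return torch.zeros(b, f, 17, 2), torch.zeros(b, 10)

def encode_garment(garment_image):              # step 2: multi-level garment features
    return torch.zeros(garment_image.shape[0], 77, 256)

class ToyVideoDiT(torch.nn.Module):             # stand-in for the video DiT backbone
    def denoise(self, latents, t, pose, shape, garment):
        return 0.98 * latents                   # placeholder update, not a real denoiser

def try_on_video(person_video, garment_image, dit, steps=50):
    pose, body_shape = extract_pose_and_shape(person_video)     # 1. pose & shape
    garment_feats = encode_garment(garment_image)               # 2. garment features
    latents = torch.randn_like(person_video)                    # 3. video data + noise
    for t in reversed(range(steps)):                            # 4. iterative diffusion
        latents = dit.denoise(latents, t, pose, body_shape, garment_feats)
    return latents                                              # 5. dressed output video

# Example call with dummy tensors: one clip of 16 frames at 3x64x64, one garment image.
# try_on_video(torch.randn(1, 16, 3, 64, 64), torch.randn(1, 3, 64, 64), ToyVideoDiT())
```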
A Two-Stage Control Strategy:
- 1. Coarse Guidance: A "garment token" provides a strong initial signal about what clothes to wear.
- 2. Fine-Grained Conditioning: CLIP features, texture maps, and outlines are injected to define material, color, and fit.
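One rough way such two-stage conditioning could be wired up, shown as a PyTorch sketch. The module and feature names are assumptions for illustration, not the published implementation.

```python
import torch
import torch.nn as nn

class GarmentConditioner(nn.Module):
    """Toy two-stage conditioning: one coarse garment token plus fine-grained features."""
    def __init__(self, dim: int = 512, clip_dim: int = 768):
        super().__init__()
        self.fine_proj = nn.Linear(clip_dim, dim)                 # CLIP / texture / outline features
        self.cross_attn = nn.MultiheadAttention(dim, 8, batch_first=True)

    def forward(self, video_tokens: torch.Tensor, garment_features: torch.Tensor) -> torch.Tensor:
        # video_tokens: (B, N, dim); garment_features: (B, M, clip_dim), e.g. CLIP patch features.
        fine = self.fine_proj(garment_features)                   # stage 2: fine-grained conditioning
        coarse = fine.mean(dim=1, keepdim=True)                   # stage 1: one pooled "garment token"
        cond = torch.cat([coarse, fine], dim=1)
        out, _ = self.cross_attn(video_tokens, cond, cond)        # inject garment condition
        return video_tokens + out
```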
Coarse-to-Fine Garment Preservation
To ensure the garment's structure is faithfully preserved, MagicTryOn uses the two-stage control strategy outlined above. This tells the model not only *what* to wear, but exactly *how* it should look and behave, from overall shape down to the finest wrinkle.
Additionally, a unique **mask-aware loss function** focuses training on the garment region, preventing the model from being distracted by the background and dramatically improving clothing fidelity.
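One plausible form of such a loss, as a hedged sketch; the exact formulation in the paper may differ.

```python
import torch
import torch.nn.functional as F

def mask_aware_loss(pred_noise: torch.Tensor, target_noise: torch.Tensor,
                    garment_mask: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # garment_mask: 1 inside the clothing region, 0 elsewhere (broadcastable to the noise tensors).
    per_pixel = F.mse_loss(pred_noise, target_noise, reduction="none")
    mask = garment_mask.expand_as(per_pixel)
    # Average the denoising error over garment pixels only, so background regions
    # do not dilute the gradient signal that drives clothing fidelity.
    return (per_pixel * mask).sum() / (mask.sum() + eps)
```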
Get Started with MagicTryOn
Bring MagicTryOn to your own projects. Follow these steps to set up the environment.
1. Create Conda Environment & Install Requirements
conda create -n magictryon python==3.10
conda activate magictryon
pip install -r requirements.txt
2. Download Model Weights
cd Magic-TryOn
HF_ENDPOINT=https://hf-mirror.com huggingface-cli download LuckyLiGY/MagicTryOn --local-dir ./weights/MagicTryOn_14B_V1
3. Run Demo Inference
For image or video try-on, use the provided prediction scripts. Full instructions for custom try-on are available on GitHub.
# Example for Image Try-On
CUDA_VISIBLE_DEVICES=0 python predict_image_tryon_up.py
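Video try-on presumably follows the same pattern; the script name below is an assumption, so check the GitHub instructions for the exact entry point.
# Example for Video Try-On (script name assumed)
CUDA_VISIBLE_DEVICES=0 python predict_video_tryon_up.py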
Project News
- June 9, 2025: Code & weights released on HuggingFace!
- May 27, 2025: Our technical paper is now available on arXiv.
Release Roadmap
- ✅ Source Code
- ✅ Inference Demo & Pretrained Weights
- ✅ Customized Try-On Utilities
- ⬜️ Testing & Training Scripts
- ⬜️ V2 Model Weights
- ⬜️ Gradio App Update