Generate high-quality text-to-video and image-to-video content using Wan 2.2 14B on budget hardware!
🚀 Performance Highlight: Generate a 1-second video in under 5 minutes on an RTX 3050 6GB!
- Low VRAM Optimization - Works on 6GB VRAM cards (RTX 3050, 3060, AMD equivalents)
- User-Friendly Interface - Simple FPS & Seconds input (automatic frame calculation)
- Dual Mode Support - Text-to-Video (t2v) and Image-to-Video (i2v) with easy switching
- Last Frame Extraction - Optional last frame saving for video chaining/extension
- GGUF Quantization - Leverages quantized models for memory efficiency
- Smart Memory Management - Aggressive CPU offloading for low-VRAM systems
- Fast Generation - 4-step sampling for rapid results
- VRAM: 6GB (RTX 3050, 3060 6GB, AMD equivalents)
- System RAM: 16GB
- Storage: ~15GB for models
- VRAM: 8GB+
- System RAM: 32GB
- Storage: SSD for faster model loading
Install these ComfyUI custom nodes (via ComfyUI Manager or manual installation):
- rgthree-comfy - Context system, workflow organization, group muting
- cg-use-everywhere - Global parameter distribution
- ComfyUI-Easy-Use - LoRA loading and model management
- ComfyUI-VideoHelperSuite - Video processing and frame selection
- ComfyUI-KJNodes - Utility nodes
- ComfyUI-GGUF - GGUF model loading (CRITICAL for this workflow)
- comfyui_memory_cleanup - Memory management for low-VRAM
- ComfyUI-Custom-Scripts - Math expression nodes
- efficiency-nodes-comfyui - Workflow efficiency nodes
- Open ComfyUI Manager
- Click "Install Custom Nodes"
- Search for each node pack above
- Click "Install" on each
- Restart ComfyUI
cd ComfyUI/custom_nodes
git clone https://github.com/rgthree/rgthree-comfy
git clone https://github.com/chrisgoringe/cg-use-everywhere
git clone https://github.com/yolain/ComfyUI-Easy-Use
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite
git clone https://github.com/kijai/ComfyUI-KJNodes
git clone https://github.com/city96/ComfyUI-GGUF
git clone https://github.com/LAOGOU-666/Comfyui-Memory_Cleanup
git clone https://github.com/pythongosssss/ComfyUI-Custom-Scripts
git clone https://github.com/jags111/efficiency-nodes-comfyui
# Restart ComfyUI

This workflow uses the wan2.2-t2v-rapid-aio-v10-nsfw GGUF model from the Phr00t/befox rapid merge.
The "D" prefix you may see in some screenshots (e.g., Dwan2.2-t2v-rapid-aio-v10-nsfw-Q3_K.gguf) is just my local folder organization. Download files with their original names as shown below - ComfyUI will load them correctly from the standard model folders.
Repository: https://huggingface.co/befox/WAN2.2-14B-Rapid-AllInOne-GGUF/tree/main/v10
Recommended for 6-8GB VRAM: Q3_K or Q4_K
- Download: wan2.2-t2v-rapid-aio-v10-nsfw-Q3_K.gguf
- Placement:
ComfyUI/models/unet or ComfyUI/models/checkpoints
- Download: wan2.2-t2v-rapid-aio-v10-nsfw-Q4_K.gguf
- Placement:
ComfyUI/models/unet or ComfyUI/models/checkpoints
- Download: wan2.2-t2v-rapid-aio-v10-nsfw-Q5_K.gguf
- Placement:
ComfyUI/models/unet or ComfyUI/models/checkpoints
Repository: https://huggingface.co/city96/umt5-xxl-encoder-gguf/tree/main
Recommended: Q3_K_M (lowest VRAM usage, good performance)
- Download: umt5-xxl-encoder-Q3_K_M.gguf
- Placement:
ComfyUI/models/clip or ComfyUI/models/text_encoders
- Download: umt5-xxl-encoder-Q4_K_M.gguf
- Placement:
ComfyUI/models/clip or ComfyUI/models/text_encoders
Repository: https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/vae
- Download: wan_2.1_vae.safetensors
- Placement:
ComfyUI/models/vae
- Verify file hashes after download (Hugging Face shows them on each file page)
- If links change, search Hugging Face for "WAN2.2-14B-Rapid-AllInOne-GGUF" (befox), "umt5-xxl-encoder-gguf" (city96), or "Wan_2.2_ComfyUI_Repackaged" (Comfy-Org)
- For best low-VRAM performance: Use Q3_K or Q4_K model + Q3_K_M CLIP + standard Wan VAE
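The hash check suggested above can be scripted. Below is a minimal Python sketch; the file path is an example and the expected checksum is a placeholder you copy from the file's Hugging Face page:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so multi-GB GGUF files never load fully into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the checksum shown on the Hugging Face file page, e.g.:
# expected = "<copy from the model page>"
# assert sha256_of("ComfyUI/models/unet/wan2.2-t2v-rapid-aio-v10-nsfw-Q3_K.gguf") == expected
```

A mismatch almost always means a truncated download; re-download the file rather than trying to use it.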
| VRAM | Main Model | Text Encoder | Total Memory | Performance |
|---|---|---|---|---|
| 6GB | Q3_K | Q3_K_M | ~11GB system | Fast, good quality |
| 8GB | Q4_K | Q3_K_M | ~13GB system | Balanced ⭐ |
| 10GB+ | Q5_K | Q4_K_M | ~15GB system | High quality |
- Load the workflow in ComfyUI
- Set your parameters:
- Width: 512 (default)
- Height: 512 (default)
- Seconds: 10 (duration of video)
- FPS: 15 (frames per second)
- Frames are calculated automatically!
- Choose mode:
- Text-to-Video (t2v): Generate from text prompt
- Image-to-Video (i2v): Animate from input image
- Enter your prompt (positive and negative)
- Queue prompt and generate!
- Make sure "Load Image" node is disabled/muted
- Enter your text prompt describing the video
- Queue prompt
- Enable the "Load Image" node
- Load your input image
- Enter your prompt (can enhance/modify the image animation)
- Queue prompt
The workflow includes automatic last frame saving for video chaining:
- Enabled by default - Last frame is automatically saved
- To disable: Mute the "Save the Last Frame" group using rgthree Group Muter
- Use case: Chain multiple generations together for longer videos
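Conceptually, chaining reduces to: generate a segment, take its last frame, feed that frame in as the i2v input for the next segment. A hedged Python sketch of that loop, where `generate_segment` is a stand-in for a full workflow run (not a real ComfyUI API), and dropping the boundary frame is an illustrative choice to avoid a duplicated frame at each join:

```python
def chain_segments(generate_segment, first_input, num_segments):
    """Join several i2v generations into one frame list.

    generate_segment: any callable that takes a starting input and returns a
    list of frames (in practice, one queued workflow run).
    """
    frames = []
    current = first_input
    for _ in range(num_segments):
        segment = generate_segment(current)
        # Skip the first frame of every segment after the first, since it
        # duplicates the previous segment's saved last frame.
        frames.extend(segment if not frames else segment[1:])
        current = segment[-1]  # the saved last frame seeds the next segment
    return frames
```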
- Input: FPS × Seconds
- Calculation: (FPS × Seconds) + 1 (the extra +1 frame is required for Wan's t2v initialization)
- Works for both t2v and i2v modes automatically
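The frame rule above, written out as a tiny Python helper for checking your settings before queueing:

```python
def frame_count(fps: int, seconds: int) -> int:
    """Wan expects (fps * seconds) + 1 frames; the +1 covers t2v initialization."""
    return fps * seconds + 1

# 15 FPS for 1 second -> 16 frames; 15 FPS for 10 seconds -> 151 frames.
```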
- GGUF quantization reduces model size
- Aggressive CPU offloading (~9GB offloaded to system RAM)
- Dynamic memory management between pipeline stages
- Memory cleanup nodes for low-VRAM systems
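For a rough sense of what "cleanup between pipeline stages" amounts to, here is an illustrative Python sketch (this is not the actual implementation of the memory-cleanup nodes; the torch import is optional so the sketch also runs on CPU-only machines):

```python
import gc

def free_memory():
    """Drop unreferenced Python objects, then ask the GPU allocator to release cached VRAM."""
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            # Returns cached (not in-use) allocations back to the driver.
            torch.cuda.empty_cache()
    except ImportError:
        pass  # no torch available; nothing GPU-side to release
```

Calling something like this between model load, sampling, and VAE decode is what lets a 14B-class pipeline fit alongside a 6GB card.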
- Clean interface with POS/NEG/Switch nodes
- One control point switches both paths (Text-to-Video and Image-to-Video) simultaneously
- No manual rewiring needed between modes
- Drop to Q3_K model instead of Q4_K
- Close other applications to free system RAM
- Enable Windows page file (set to system managed)
- Reduce video length (fewer seconds)
- Install all required custom nodes (see list above)
- Use ComfyUI Manager's "Install Missing Nodes" feature
- Restart ComfyUI after installing nodes
- Ensure ComfyUI-GGUF is installed correctly
- Verify model files are in correct folders with exact filenames
- Check that files downloaded completely (verify file sizes)
- Try Q4_K or Q5_K model if you have more VRAM
- Increase steps (though 4 steps works well with this model)
- Adjust prompt strength and detail
- Use better quality input images for i2v mode
- This is normal for low-VRAM setups (CPU offloading takes time)
- Expected: ~5 minutes for 1 second video on RTX 3050 6GB
- Close background applications
- Ensure models are on SSD for faster loading
- Model: Q4_K + Q3_K_M CLIP
- Video length: 1 second (16 frames @ 15 FPS)
- Total time: ~4.6 minutes
- Model loading: ~1 min
- Sampling (4 steps): ~1.8 min
- VAE decode: ~30s
- Quality: Good, suitable for most use cases
- RTX 3060 12GB: ~3 minutes (less offloading overhead)
- RTX 4060 8GB: ~2.5 minutes (faster architecture)
- RTX 4070 12GB: ~2 minutes (minimal offloading)
Times are approximate and depend on system configuration
The workflow supports standard LoRA loading through the EasyUse nodes. You can add any compatible Wan 2.2 LoRAs to enhance or modify output.
This workflow is released into the public domain with no restrictions. Use it however you want - personal, commercial, modified, redistributed.
Contributions, improvements, and forks are welcome! No attribution required, but appreciated! ❤️
- befox - GGUF quantized Wan models
- city96 - ComfyUI-GGUF loader and umt5-xxl encoder
- Comfy-Org - Wan 2.2 repackaged models
- ComfyUI community - All the amazing custom node developers
This workflow is released into the public domain with no restrictions.
You are free to:
- Use commercially
- Modify and redistribute
- Use in closed-source projects
- Not provide attribution (though it's appreciated!)
No warranty provided. Use at your own risk.
If you find this workflow helpful:
- ⭐ Star this repository
- 🐛 Report issues or bugs
- 💡 Suggest improvements
- 📢 Share with the community!
