Consistency Distillation for Novel View Synthesis (NVS)
Priority: ⭐ High (Tier 1)
Documentation: 03_consistency_distillation.md
Overview
Distill multi-step diffusion models (Zero123, etc.) into single-step consistency models. Achieves 50× speedup over standard diffusion while maintaining quality.
Key Innovation
Standard Diffusion: z_T → z_{T-1} → ... → z_1 → z_0 (50 steps)
Consistency Model: z_t → z_0 (1 step)
Learn to map ANY point on the diffusion trajectory directly to the clean image.
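The self-consistency property above can be illustrated with a toy, numpy-only sketch. This is not the real model: the "trajectory" is a straight line from a clean sample x0 to a noise endpoint z, and the consistency function is written analytically rather than learned, purely to show that any point (x_t, t) maps back to the same x0 in one evaluation.

```python
import numpy as np

# Toy trajectory: x_t = (1 - t) * x0 + t * z, interpolating from the
# clean sample x0 (t = 0) to pure noise z (t = 1).
rng = np.random.default_rng(0)
x0 = rng.standard_normal(4)        # "clean image" (toy 4-dim vector)
z = rng.standard_normal(4)         # terminal noise

def x_at(t):
    # Point on the diffusion trajectory at time t.
    return (1.0 - t) * x0 + t * z

def f(x_t, t):
    # Analytic consistency function for this toy trajectory:
    # invert the interpolation to recover x0 from (x_t, t).
    return (x_t - t * z) / (1.0 - t)

# Any point on the trajectory maps to the SAME clean endpoint.
for t in (0.1, 0.5, 0.9):
    assert np.allclose(f(x_at(t), t), x0)
```

A learned consistency model plays the role of f here: instead of inverting a known interpolation, a network f_θ(x_t, t, condition) is trained so that all points on a teacher's trajectory produce the same output.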
Implementation Phases
Phase 1: Setup (1-2 days)
Phase 2: Distillation (3-4 days)
Phase 3: Optimization (2-3 days)
Phase 4: Integration (ongoing)
Technical Details
Model Size: ~200M params (vs 860M for SD U-Net)
Inference: ~110 GFLOPs per view (~3 ms on RX 7800 XT)
Speedup: 50× over Zero123
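The per-view latency above is consistent with a back-of-envelope check, assuming a peak FP32 throughput of roughly 37 TFLOPS for the RX 7800 XT (an assumed spec; real-world utilization is typically well below peak, so this is an optimistic bound, not a measurement):

```python
# Back-of-envelope latency estimate from the compute figure above.
gflops_per_view = 110          # ~110 GFLOPs per single-step view
peak_tflops = 37               # assumed FP32 peak for RX 7800 XT

# latency [ms] = work [GFLOPs] / throughput [GFLOPs per ms]
latency_ms = gflops_per_view / (peak_tflops * 1000) * 1000
# ≈ 3 ms, matching the ~3 ms figure quoted above
```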
Key Equations
Consistency function: f_θ: (x_t, t, condition) → x_0
Consistency loss: L_CD = d(f_θ(x_t, t), f_{θ⁻}(x_{t-Δt}, t-Δt)), where d is a distance metric (e.g. L2 or LPIPS) and θ⁻ are the EMA teacher weights
EMA update: θ⁻ ← μ·θ⁻ + (1-μ)·θ
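The three equations above combine into one training step: evaluate the student on a noisy point, evaluate the frozen EMA teacher one solver step earlier on the same trajectory, penalize the distance between the two outputs, and update the EMA copy. Below is a minimal numpy-only sketch under loud assumptions: the "network" is a single linear map, f uses a skip parameterization c_skip(t)·x + c_out(t)·(x @ θ), d is squared L2, and the noising is a toy straight line x_t = x0 + t·noise. All names and schedules are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, lr, dt, mu = 4, 1e-2, 0.05, 0.99
theta = rng.standard_normal((dim, dim)) * 0.1   # student params θ
theta_ema = theta.copy()                        # EMA teacher params θ⁻

def c_skip(t): return 1.0 - t   # enforces the boundary f(x, 0) = x
def c_out(t):  return t

def f(params, x_t, t):
    # Toy consistency function with skip parameterization.
    return c_skip(t) * x_t + c_out(t) * (x_t @ params)

def train_step(theta, theta_ema):
    x0 = rng.standard_normal(dim)
    noise = rng.standard_normal(dim)
    t = rng.uniform(dt, 1.0)
    x_t = x0 + t * noise                  # noisy point at time t
    x_prev = x0 + (t - dt) * noise        # adjacent point at t - Δt
    target = f(theta_ema, x_prev, t - dt) # teacher output, no gradient
    resid = f(theta, x_t, t) - target     # L_CD = ||resid||^2
    # Analytic gradient of L_CD w.r.t. theta for this linear toy f.
    grad = 2.0 * c_out(t) * np.outer(x_t, resid)
    theta = theta - lr * grad                       # SGD on student
    theta_ema = mu * theta_ema + (1 - mu) * theta   # θ⁻ ← μθ⁻ + (1-μ)θ
    return theta, theta_ema, float(resid @ resid)

for _ in range(100):
    theta, theta_ema, loss = train_step(theta, theta_ema)
```

In a real implementation the gradient comes from autodiff and the teacher step x_{t-Δt} comes from one step of an ODE solver applied to the pretrained diffusion model (e.g. Zero123), with the teacher branch detached from the graph.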
References
- Song et al. (2023): Consistency Models
- Liu et al. (2023): Zero-1-to-3
- Luo et al. (2023): Latent Consistency Models
Parent: #6