
[CVS] Consistency Distillation for Novel View Synthesis #8

@CalebisGross

Description

Consistency Distillation for Novel View Synthesis (CVS)

Priority: ⭐ High (Tier 1)
Documentation: 03_consistency_distillation.md

Overview

Distill multi-step diffusion models (Zero123, etc.) into single-step consistency models, achieving a ~50× inference speedup over standard 50-step diffusion sampling while maintaining comparable quality.

Key Innovation

Standard Diffusion:  z_T → z_{T-1} → ... → z_1 → z_0  (50 steps)
Consistency Model:   z_t → z_0                        (1 step)

Learn to map ANY point on the diffusion trajectory directly to the clean image.
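This mapping is usually enforced with the skip-connection parameterization from Song et al. (2023), which guarantees the boundary condition f_θ(x, ε) = x by construction. A minimal sketch (assuming a torch-style network `net`; `sigma_data` and `eps` values are illustrative):

```python
import torch

def consistency_fn(net, x_t, t, cond, sigma_data=0.5, eps=0.002):
    """Skip-connection parameterization from Song et al. (2023):
        f_theta(x, t) = c_skip(t) * x + c_out(t) * F_theta(x, t, cond)
    At t = eps, c_skip = 1 and c_out = 0, so f_theta(x, eps) = x exactly.
    `t` is expected to be a tensor so the coefficients broadcast over x_t."""
    c_skip = sigma_data**2 / ((t - eps)**2 + sigma_data**2)
    c_out = sigma_data * (t - eps) / torch.sqrt(t**2 + sigma_data**2)
    return c_skip * x_t + c_out * net(x_t, t, cond)
```

The parameterization is what lets a single network handle every point on the trajectory: near t = ε it passes the input through, and at large t the raw network output dominates.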

Implementation Phases

Phase 1: Setup (1-2 days)

  • Adapt Zero123 architecture for consistency training
  • Implement consistency loss (LPIPS + L2)
  • Setup training data pipeline (Objaverse)

Phase 2: Distillation (3-4 days)

  • Progressive distillation schedule (1024→256→64→16→4→1 steps)
  • EMA target network
  • Validate one-step generation quality
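The EMA target network and the step schedule above can be sketched as follows (the `mu` value is illustrative; the schedule list is taken directly from the plan):

```python
import torch

def ema_update(target, online, mu=0.95):
    """EMA target update: theta_minus <- mu * theta_minus + (1 - mu) * theta.
    The target net provides stable regression targets during distillation."""
    with torch.no_grad():
        for p_t, p_o in zip(target.parameters(), online.parameters()):
            p_t.mul_(mu).add_(p_o, alpha=1.0 - mu)

# Progressive distillation schedule from the plan: each stage distills
# a model with N solver steps into one with the next (smaller) N.
step_schedule = [1024, 256, 64, 16, 4, 1]
```

Called once per optimizer step, `ema_update` keeps the target weights a slowly-moving average of the student, which is what the consistency loss regresses against.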

Phase 3: Optimization (2-3 days)

  • Latent space consistency (LCM-style)
  • Model compression (pruning, quantization)
  • Batch inference optimization
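For the batch-inference item, the payoff of a one-step model is that N novel views reduce to a single batched forward pass. A sketch, assuming a hypothetical `consistency_net(x_T, t, cond, pose)` signature and illustrative latent shape:

```python
import torch

@torch.no_grad()
def generate_views(consistency_net, cond_image, poses, T=80.0):
    """One-step novel view synthesis for a batch of camera poses.
    All N views share a single forward pass: start from pure noise at
    t = T and map directly to x_0 with the consistency function."""
    n = poses.shape[0]
    x_T = torch.randn(n, 4, 32, 32) * T            # noise at the max time
    cond = cond_image.expand(n, -1, -1, -1)        # tile the reference view
    t = torch.full((n,), T)
    return consistency_net(x_T, t, cond, poses)    # x_0 in a single step
```

Compared with looping 50 solver steps per view, batching over poses keeps the GPU saturated and makes per-view latency roughly constant.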

Phase 4: Integration (ongoing)

  • Multi-view generation pipeline
  • Integration with 3DGS optimization
  • Real-time demo

Technical Details

Model Size: ~200M params (vs 860M for the SD U-Net)
Inference: ~110 GFLOPs per view (~3 ms on an RX 7800 XT)
Speedup: ~50× over multi-step Zero123 sampling

Key Equations

Consistency function: f_θ: (x_t, t, condition) → x_0
Consistency loss: L_CD = d(f_θ(x_t, t), f_{θ⁻}(x_{t-Δt}, t-Δt)), where x_{t-Δt} is obtained from one ODE solver step of the frozen teacher
EMA update: θ⁻ ← μ·θ⁻ + (1-μ)·θ
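Taken together, the three equations above form one distillation update. A toy sketch (assuming torch-style models with signature `model(x, t, cond)` and a frozen teacher `solver_step`; the distance d is plain L2 here for brevity):

```python
import torch

def distillation_step(student, target, solver_step, x_t, t, dt, cond, opt, mu=0.95):
    """One consistency-distillation update:
        L_CD = d(f_theta(x_t, t), f_{theta^-}(x_{t-dt}, t-dt)),
    followed by the EMA update theta^- <- mu*theta^- + (1-mu)*theta.
    `solver_step` is one ODE step of the frozen teacher (assumption)."""
    with torch.no_grad():
        x_prev = solver_step(x_t, t, dt, cond)     # x_{t-dt} via the teacher
        ref = target(x_prev, t - dt, cond)         # f_{theta^-} target value
    loss = torch.nn.functional.mse_loss(student(x_t, t, cond), ref)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():                          # EMA update of theta^-
        for p_t, p_s in zip(target.parameters(), student.parameters()):
            p_t.mul_(mu).add_(p_s, alpha=1.0 - mu)
    return loss.item()
```

The gradient only flows through the student branch; the target branch is detached, which is what makes the self-consistency objective stable.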

References

  • Song et al. (2023): Consistency Models
  • Liu et al. (2023): Zero-1-to-3
  • Luo et al. (2023): Latent Consistency Models

Parent: #6

Metadata


    Labels

    CVS · Consistency Distillation · research (Research and experimental work)
