Noise Conditional Variational Score Distillation

📅 2025-06-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the low sampling efficiency of diffusion models by proposing a variational score distillation framework that distills a pre-trained diffusion model into a generative denoiser that generalizes across the full noise schedule. Theoretically, it establishes the equivalence between the unconditional score function and the score of the denoising posterior distribution. Methodologically, it introduces noise-conditioned modeling and a generative denoiser architecture, enabling zero-shot probabilistic inference, fast single-step sampling, and quality-controllable multi-step generation. In class-conditional image generation, scaling the test-time compute lets the method outperform the teacher diffusion model and match substantially larger consistency models; on inverse problems it achieves record-breaking LPIPS scores with significantly fewer function evaluations (NFEs) than diffusion-based methods.
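The claimed connection between the unconditional score and the denoising-posterior score can be illustrated with a standard Bayes-rule identity (a generic sketch, not necessarily the paper's exact formulation; assume Gaussian noising $x_t = x_0 + \sigma \epsilon$, $\epsilon \sim \mathcal{N}(0, I)$):

$$
\nabla_{x_0} \log p(x_0 \mid x_t)
= \nabla_{x_0} \log p(x_0) + \nabla_{x_0} \log p(x_t \mid x_0)
= \nabla_{x_0} \log p(x_0) + \frac{x_t - x_0}{\sigma^2},
$$

so the score of the denoising posterior decomposes into the unconditional (data) score plus a closed-form Gaussian term, which is what allows a pre-trained score model to supervise a sampler of the posterior.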

📝 Abstract
We propose Noise Conditional Variational Score Distillation (NCVSD), a novel method for distilling pretrained diffusion models into generative denoisers. We achieve this by revealing that the unconditional score function implicitly characterizes the score function of denoising posterior distributions. By integrating this insight into the Variational Score Distillation (VSD) framework, we enable scalable learning of generative denoisers capable of approximating samples from the denoising posterior distribution across a wide range of noise levels. The proposed generative denoisers exhibit desirable properties that allow fast generation while preserving the benefits of iterative refinement: (1) fast one-step generation through sampling from pure Gaussian noise at high noise levels; (2) improved sample quality by scaling the test-time compute with multi-step sampling; and (3) zero-shot probabilistic inference for flexible and controllable sampling. We evaluate NCVSD through extensive experiments, including class-conditional image generation and inverse problem solving. By scaling the test-time compute, our method outperforms teacher diffusion models and is on par with consistency models of larger sizes. Additionally, with significantly fewer NFEs than diffusion-based methods, we achieve record-breaking LPIPS on inverse problems.
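The two sampling modes the abstract describes (one-step generation and multi-step refinement) can be sketched as follows. This is a minimal illustration under assumed interfaces: `denoiser(x, sigma)` is a hypothetical stand-in for the learned generative denoiser, assumed to draw a sample from the denoising posterior p(x0 | x_t) at noise level sigma, and the re-noising schedule is a generic choice rather than the paper's exact procedure.

```python
import numpy as np

def multistep_sample(denoiser, sigmas, shape, rng=None):
    """Sample with a generative denoiser across a decreasing noise schedule.

    `denoiser(x, sigma)` is assumed to return a sample from the denoising
    posterior p(x0 | x_t = x) at noise level sigma (hypothetical interface).
    """
    rng = np.random.default_rng(rng)
    # One-step generation: start from pure Gaussian noise at the highest
    # noise level and denoise once.
    x = sigmas[0] * rng.standard_normal(shape)
    x0 = denoiser(x, sigmas[0])
    # Optional multi-step refinement: re-noise the current estimate to a
    # lower noise level and denoise again, trading extra NFEs for quality.
    for sigma in sigmas[1:]:
        x = x0 + sigma * rng.standard_normal(shape)
        x0 = denoiser(x, sigma)
    return x0

# Example: three-step sampling with a toy denoiser (illustrative only).
toy = lambda x, sigma: x / (1.0 + sigma ** 2)
sample = multistep_sample(toy, sigmas=[80.0, 10.0, 1.0], shape=(4,), rng=0)
```

A single element in `sigmas` reduces to one-step generation; appending lower noise levels spends additional NFEs on iterative refinement, which is the test-time-compute scaling knob the abstract refers to.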
Problem

Research questions and friction points this paper is trying to address.

Distilling diffusion models into generative denoisers
Learning scalable denoisers for various noise levels
Enabling fast generation with iterative refinement benefits
Innovation

Methods, ideas, or system contributions that make the work stand out.

Noise Conditional Variational Score Distillation
Scalable learning of generative denoisers
Fast one-step and multi-step sampling
Xinyu Peng
Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Ziyang Zheng
Shanghai Jiao Tong University
Signal Processing · Inverse Problems · Photonic Computing
Yaoming Wang
Meituan Inc, China
Han Li
Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China
Nuowen Kan
Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China
Wenrui Dai
Shanghai Jiao Tong University
Predictive Modeling · Image/Video Coding · Signal Processing
Chenglin Li
Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China
Junni Zou
Professor, Shanghai Jiao Tong University
Multimedia communications - network resource optimization
Hongkai Xiong
Distinguished Professor, Shanghai Jiao Tong University
Image and Video Coding · Signal Processing · Multimedia Communication · Vision and Learning