Beyond Scores: Proximal Diffusion Models

📅 2025-07-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion models rely on score functions—the gradients of log-densities—for sampling, but score-based samplers typically require many discretization steps. This work proposes Proximal Diffusion Models (ProxDM), which replace the score function with the proximal mapping of the log-density and construct the generative process via backward discretization of the reverse-time stochastic differential equation. Theoretically, ProxDM achieves $\varepsilon$-distributional accuracy in $\widetilde{O}(d/\sqrt{\varepsilon})$ steps—significantly improving upon the $\widetilde{O}(1/\varepsilon^2)$ complexity of standard score-based methods. To learn the proximal operator, the authors introduce a proximal matching objective. Empirical results demonstrate that two ProxDM variants surpass state-of-the-art score-based methods within as few as 4–8 sampling steps, exhibiting both faster convergence and higher sampling efficiency.
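The link between backward discretization and proximal maps can be sketched as follows (a simplified reverse dynamics with pure Langevin drift; the paper's exact discretization may differ). The proximal map of a function $f$ with step size $\lambda$ is

$$\operatorname{prox}_{\lambda f}(x) \;=\; \arg\min_{z}\Big\{ f(z) + \tfrac{1}{2\lambda}\|z - x\|^2 \Big\}.$$

With $f = -\log p_t$, an implicit (backward) Euler step evaluates the score at the next iterate,

$$x_{k+1} = x_k + h\,\nabla \log p_t(x_{k+1}) + \sqrt{2h}\,z_k, \qquad z_k \sim \mathcal{N}(0, I),$$

and the first-order optimality condition shows this is exactly a proximal step, $x_{k+1} = \operatorname{prox}_{h(-\log p_t)}\!\big(x_k + \sqrt{2h}\,z_k\big)$, in contrast to the explicit step $x_{k+1} = x_k + h\,\nabla\log p_t(x_k) + \sqrt{2h}\,z_k$ used by conventional score-based samplers.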

📝 Abstract
Diffusion models have quickly become some of the most popular and powerful generative models for high-dimensional data. The key insight that enabled their development was the realization that access to the score -- the gradient of the log-density at different noise levels -- allows for sampling from data distributions by solving a reverse-time stochastic differential equation (SDE) via forward discretization, and that popular denoisers allow for unbiased estimators of this score. In this paper, we demonstrate that an alternative, backward discretization of these SDEs, using proximal maps in place of the score, leads to theoretical and practical benefits. We leverage recent results in proximal matching to learn proximal operators of the log-density and, with them, develop Proximal Diffusion Models (ProxDM). Theoretically, we prove that $\widetilde{O}(d/\sqrt{\varepsilon})$ steps suffice for the resulting discretization to generate an $\varepsilon$-accurate distribution w.r.t. the KL divergence. Empirically, we show that two variants of ProxDM achieve significantly faster convergence within just a few sampling steps compared to conventional score-matching methods.
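To make the backward-discretization idea concrete, here is a minimal runnable sketch on a toy target where the proximal map is available in closed form. All names and the step size are illustrative assumptions; ProxDM instead learns the proximal operator of the log-density at each noise level via its proximal matching objective.

```python
import numpy as np

# Toy illustration of a backward-discretized (proximal) Langevin step.
# Target: standard Gaussian, p(x) ∝ exp(-x^2/2), so f(x) = -log p(x) = x^2/2
# and the proximal map has the closed form prox_{h f}(y) = y / (1 + h).
# This is a hypothetical sketch of the idea, not ProxDM's actual sampler.

rng = np.random.default_rng(0)

def prox_neg_log_p(y, h):
    # prox_{h f}(y) with f(z) = z^2/2: minimizes z^2/2 + (z - y)^2 / (2h)
    return y / (1.0 + h)

def proximal_langevin(n_samples=50_000, n_steps=200, h=0.05):
    x = np.zeros(n_samples)  # start all chains at the origin
    for _ in range(n_steps):
        noise = rng.standard_normal(n_samples)
        # implicit (backward) Euler step:
        # x_{k+1} = prox_{h f}(x_k + sqrt(2h) * z_k)
        x = prox_neg_log_p(x + np.sqrt(2.0 * h) * noise, h)
    return x

samples = proximal_langevin()
# Stationary variance of this scheme is 2 / (2 + h) ≈ 0.98 for h = 0.05,
# i.e. close to the target's unit variance, with a small discretization bias.
```

The closed-form prox here stands in for the learned proximal network; for a Gaussian target the implicit step contracts toward the mode at every iteration, which is the mechanism behind the improved step-count behavior the paper analyzes.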
Problem

Research questions and friction points this paper is trying to address.

Score-based samplers need many discretization steps to reach an accurate distribution
Standard forward discretization of the reverse-time SDE limits convergence guarantees
Learning a sampler component other than the score requires a suitable training objective
Innovation

Methods, ideas, or system contributions that make the work stand out.

Backward SDE discretization using proximal maps in place of the score
Proximal matching objective for learning proximal operators of the log-density
$\widetilde{O}(d/\sqrt{\varepsilon})$ step complexity; faster convergence than score-matching methods