🤖 AI Summary
Prior analyses of DDPM sampling focus on the exponential Euler discretization, with guarantees that scale at least linearly in the dimension d or the initial Fisher information, which limits scalability in high dimensions.
Method: We analyze DDRaM (the denoising diffusion randomized midpoint method), an SDE integrator for DDPMs that uses an additional randomized midpoint to improve discretization accuracy while preserving the standard DDPM sampling mechanism. Under Lipschitz smoothness of the score function, DDRaM approximates the SDE more accurately without altering the underlying generative process.
Contribution/Results: We establish the first sublinear iteration complexity bound for pure DDPM sampling: only Õ(√d) score evaluations suffice to attain ε-accuracy. Prior works that obtained such bounds relied on ODE-based sampling and on sampler modifications that deviate from practical use. Our analysis combines the “shifted composition rule” framework with techniques from log-concave sampling. Experiments on pre-trained image synthesis models show that the method performs well in practice.
📝 Abstract
SDE-based methods such as denoising diffusion probabilistic models (DDPMs) have shown remarkable success in real-world sample generation tasks. Prior analyses of DDPMs have focused on the exponential Euler discretization, showing guarantees that generally depend at least linearly on the dimension or initial Fisher information. Inspired by works in log-concave sampling (Shen and Lee, 2019), we analyze an integrator -- the denoising diffusion randomized midpoint method (DDRaM) -- that leverages an additional randomized midpoint to better approximate the SDE. Using a recently developed analytic framework called the "shifted composition rule", we show that this algorithm enjoys favorable discretization properties under appropriate smoothness assumptions, with sublinear $\widetilde{O}(\sqrt{d})$ score evaluations needed to ensure convergence. This is the first sublinear complexity bound for pure DDPM sampling -- prior works that obtained such bounds worked instead with ODE-based sampling and had to make modifications to the sampler that deviate from how they are used in practice. We also provide experimental validation of the advantages of our method, showing that it performs well in practice with pre-trained image synthesis models.
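To give a rough feel for the randomized midpoint idea, the sketch below applies a generic randomized midpoint step to the Langevin SDE $dX = s(X)\,dt + \sqrt{2}\,dB$ in one dimension. It is not the DDRaM sampler itself: the exact score of a standard Gaussian stands in for a learned score network, and every name, the step size, and the particle count are illustrative assumptions rather than anything taken from the paper.

```python
import math
import random


def score_standard_gaussian(x: float) -> float:
    """Score (gradient of the log-density) of N(0, 1).

    Assumption: this closed form stands in for a learned score network.
    """
    return -x


def randomized_midpoint_step(x, h, score, rng):
    """One randomized midpoint step for dX = score(X) dt + sqrt(2) dB."""
    alpha = rng.random()  # uniformly random fraction of the step
    # Brownian increments over [0, alpha*h] and [alpha*h, h]. Reusing w_mid
    # below couples the midpoint and the full step to the same Brownian path.
    w_mid = rng.gauss(0.0, math.sqrt(alpha * h))
    w_rest = rng.gauss(0.0, math.sqrt((1.0 - alpha) * h))
    # Euler-type prediction of the state at the random time alpha*h.
    x_mid = x + alpha * h * score(x) + math.sqrt(2.0) * w_mid
    # Full step, with the drift evaluated at the random midpoint.
    return x + h * score(x_mid) + math.sqrt(2.0) * (w_mid + w_rest)


def sample(n_particles=2000, n_steps=100, h=0.1, x0=3.0, seed=0):
    """Run independent chains from x0 and return their final states."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_particles):
        x = x0
        for _ in range(n_steps):
            x = randomized_midpoint_step(x, h, score_standard_gaussian, rng)
        finals.append(x)
    return finals
```

Each step costs two score evaluations, one at the current point and one at the random midpoint. Because the evaluation time is drawn uniformly within the step, the drift term is an unbiased estimate of the drift integral along the predicted path, which is the mechanism that randomized midpoint analyses exploit to reduce discretization bias.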