🤖 AI Summary
This work addresses the optimization of noise scheduling and time discretization in diffusion models under a limited number of sampling steps, with the goal of minimizing the discrepancy between the generated and target distributions. By constructing a simplified diffusion model with a Gaussian source distribution, the authors derive a closed-form solution for the reverse process and analyze the discretization error using the KL divergence and the Euler–Maclaurin expansion. Leveraging the calculus of variations, they derive a "tangent law" for noise scheduling whose coefficient is analytically determined by the eigenvalues of the source covariance matrix. This approach also yields an optimal time discretization criterion for pretrained models without requiring retraining, and it consistently outperforms existing baselines across multiple datasets and architectures, excelling in particular under very tight sampling budgets.
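To make the schedule concrete, here is a minimal illustrative sketch of a tangent-law placement of noise levels. This is an assumption-laden reading, not the paper's implementation: the function name `tangent_schedule`, the sigma range, and the unit coefficient are all hypothetical, whereas the paper determines the coefficient analytically from the eigenvalues of the source covariance matrix.

```python
import numpy as np

def tangent_schedule(n_steps, sigma_min=0.002, sigma_max=80.0):
    """Hypothetical tangent-law noise schedule (illustrative sketch).

    Noise levels are placed as sigma_i = tan(theta_i) at uniformly spaced
    angles theta_i. A unit coefficient inside the tangent is assumed here;
    the paper derives it from the source covariance eigenvalues.
    """
    theta_max = np.arctan(sigma_max)
    theta_min = np.arctan(sigma_min)
    # Angles decrease so that the noise level shrinks over reverse sampling.
    thetas = np.linspace(theta_max, theta_min, n_steps)
    return np.tan(thetas)

sigmas = tangent_schedule(n_steps=10)
print(sigmas)  # steps are sparse near sigma_max and dense near sigma_min
```

One consequence of this parameterization is that a uniform grid in the angle variable automatically concentrates sampling steps at low noise levels, which is consistent with the reported gains under tight step budgets.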
📝 Abstract
An elementary approach to characterizing the impact of noise scheduling and time discretization in generative diffusion models is developed. We first use the Cramér–Rao bound to identify the Gaussian setting as a fundamental performance limit, motivating its study as a reference. Building on this insight, we consider a simplified model in which the source distribution is a multivariate Gaussian with a given covariance matrix, together with the deterministic reverse sampling process. The explicit closed-form evolution trajectory of the distributions across reverse sampling steps is derived, and consequently the Kullback-Leibler (KL) divergence between the source distribution and the reverse sampling output is obtained. The effect of the number of time discretization steps on the convergence of this KL divergence is studied via the Euler–Maclaurin expansion. An optimization problem is then formulated; its solution, obtained via the calculus of variations, is a noise schedule shown to follow a tangent law whose coefficient is determined by the eigenvalues of the source covariance matrix. In an alternative scenario, more realistic in practice, where pretrained models are available for given noise schedules, the KL divergence also provides a measure for comparing different time discretization strategies in reverse sampling. Experiments across different datasets and pretrained models demonstrate that the time discretization strategy selected by our approach consistently outperforms baseline and search-based strategies, particularly when the budget on the number of function evaluations is very tight.
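Since the discretization criterion is stated in terms of the KL divergence between Gaussians, the standard closed-form expression for that divergence is the natural computational primitive. The sketch below implements only this textbook formula; the function name `gaussian_kl` and the comparison harness are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) ) for multivariate Gaussians.

    KL = 0.5 * [ tr(cov1^-1 cov0) + (mu1-mu0)^T cov1^-1 (mu1-mu0)
                 - d + ln(det cov1 / det cov0) ]
    """
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    tr_term = np.trace(cov1_inv @ cov0)
    quad_term = float(diff @ cov1_inv @ diff)
    # slogdet avoids overflow/underflow of the raw determinant in high dimension.
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (tr_term + quad_term - d + logdet1 - logdet0)

# Hypothetical comparison: a diagonal source covariance, whose eigenvalues
# (the diagonal entries) are what the analysis keys on, against the output
# covariances of two candidate discretization strategies.
mu = np.zeros(3)
source_cov = np.diag(np.array([4.0, 1.0, 0.25]))
output_cov_a = source_cov * 1.05               # mild covariance mismatch
output_cov_b = source_cov + 0.5 * np.eye(3)    # larger mismatch
print(gaussian_kl(mu, source_cov, mu, output_cov_a))
print(gaussian_kl(mu, source_cov, mu, output_cov_b))
```

In a strategy comparison under the paper's Gaussian setting, one would compute the covariance of the reverse-sampling output under each candidate time discretization (available in closed form there) and select the discretization with the smallest divergence from the source.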