🤖 AI Summary
Existing convergence analyses of Score-Based Generative Models (SGMs) under the Wasserstein-2 (W₂) distance rely on strong assumptions, such as log-concavity of the data distribution and high-order regularity of the score function, which limit the theory's applicability.
Method: We introduce a novel theoretical framework that leverages the regularizing effect of the Ornstein-Uhlenbeck (OU) process: we prove that weak log-concavity of the data distribution propagates into strict log-concavity over time, and we uncover an alternating contraction structure in the drift of the time-reversed OU process. By analyzing the Hamilton–Jacobi–Bellman (HJB) equation that governs the log-density of the forward process, we eliminate the dependence on score regularity.
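For orientation, here is a minimal sketch of the standard forward/reverse OU dynamics this kind of analysis works with, under one common normalization (the paper's exact constants and conventions may differ). The last line records the HJB-type equation satisfied by the negative log-density, obtained from the Fokker–Planck equation of the forward process via the substitution $v_t = -\log p_t$:

```latex
% Forward OU process with stationary law N(0, I_d); p_t denotes Law(X_t).
\begin{align*}
  \text{Forward:}\quad & dX_t = -X_t\,dt + \sqrt{2}\,dB_t,
      \qquad X_0 \sim p_{\mathrm{data}}, \\
  \text{Reverse:}\quad & dY_t = \bigl(Y_t + 2\nabla \log p_{T-t}(Y_t)\bigr)\,dt
      + \sqrt{2}\,d\bar{B}_t, \qquad Y_0 \sim p_T, \\
  \text{HJB:}\quad & \partial_t v_t = \Delta v_t - |\nabla v_t|^2
      + x\cdot\nabla v_t - d,
      \qquad v_t := -\log p_t .
\end{align*}
```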
Contribution/Results: Our analysis significantly relaxes the conditions required for convergence, providing the first provable W₂ convergence guarantees for non-log-concave, non-smooth distributions, including Gaussian mixtures, thereby extending the theoretical scope of SGMs beyond prior restrictive settings.
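To see where the alternating contraction comes from in the normalization above (an illustrative back-of-the-envelope computation, not the paper's precise statement), differentiate the reverse drift; its Jacobian is $I_d + 2\nabla^2 \log p_{T-t}$, so one-sided contractivity is governed by how strictly log-concave the evolving density is at that time:

```latex
\begin{align*}
  b_t(y) &:= y + 2\nabla\log p_{T-t}(y), \qquad
  \nabla b_t(y) = I_d + 2\nabla^2 \log p_{T-t}(y), \\
  \langle b_t(y) - b_t(y'),\, y - y' \rangle
  &\le \Bigl(1 + 2\sup_{x}\lambda_{\max}\bigl(\nabla^2\log p_{T-t}(x)\bigr)\Bigr)
  \,\lVert y - y' \rVert^2 .
\end{align*}
```

In particular, the drift is contractive on time intervals where $\nabla^2\log p_{T-t} \preceq -\kappa I_d$ with $\kappa > 1/2$, and can fail to contract on intervals where the density is only weakly log-concave.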
📝 Abstract
Score-based Generative Models (SGMs) aim to sample from a target distribution by learning score functions using samples perturbed by Gaussian noise. Existing convergence bounds for SGMs in the $\mathcal{W}_2$-distance rely on stringent assumptions about the data distribution. In this work, we present a novel framework for analyzing $\mathcal{W}_2$-convergence in SGMs, significantly relaxing traditional assumptions such as log-concavity and score regularity. Leveraging the regularization properties of the Ornstein-Uhlenbeck (OU) process, we show that weak log-concavity of the data distribution evolves into log-concavity over time. This transition is rigorously quantified through a PDE-based analysis of the Hamilton-Jacobi-Bellman equation governing the log-density of the forward process. Moreover, we establish that the drift of the time-reversed OU process alternates between contractive and non-contractive regimes, reflecting the dynamics of concavity. Our approach circumvents the need for stringent regularity conditions on the score function and its estimators, relying instead on milder, more practical assumptions. We demonstrate the wide applicability of this framework through explicit computations on Gaussian mixture models, illustrating its versatility and potential for broader classes of data distributions.
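As a concrete instance of the Gaussian-mixture computations mentioned in the abstract, the sketch below (our own 1-D illustration; function names and parameters are hypothetical, not the paper's code) uses the fact that under the OU normalization above a component $N(m, s^2)$ evolves in closed form to $N(m e^{-t},\, s^2 e^{-2t} + 1 - e^{-2t})$, so the perturbed density and its score remain explicit:

```python
import numpy as np
from scipy.special import logsumexp

def ou_mixture_params(means, variances, t):
    """Evolve 1-D Gaussian-mixture components under the OU process
    dX_t = -X_t dt + sqrt(2) dB_t (stationary law N(0, 1)):
    N(m, s^2)  ->  N(m e^{-t}, s^2 e^{-2t} + 1 - e^{-2t})."""
    decay = np.exp(-t)
    return means * decay, variances * decay**2 + (1.0 - decay**2)

def mixture_score(x, weights, means, variances):
    """Score d/dx log p(x) of a 1-D Gaussian mixture at the points x,
    using log-sum-exp for numerically stable component responsibilities."""
    x = np.atleast_1d(x)[:, None]                      # shape (n, 1)
    log_comp = (np.log(weights)
                - 0.5 * np.log(2.0 * np.pi * variances)
                - 0.5 * (x - means) ** 2 / variances)  # shape (n, k)
    log_post = log_comp - logsumexp(log_comp, axis=1, keepdims=True)
    grads = -(x - means) / variances                   # per-component scores
    return np.sum(np.exp(log_post) * grads, axis=1)

# A bimodal (hence non-log-concave) target, progressively smoothed by OU noise.
w = np.array([0.5, 0.5])
m = np.array([-3.0, 3.0])
v = np.array([0.25, 0.25])
for t in (0.0, 0.5, 2.0):
    mt, vt = ou_mixture_params(m, v, t)
    # As t grows, the score at x approaches -x (the standard Gaussian score).
    print(f"t={t}: score(1.0) = {mixture_score(1.0, w, mt, vt)[0]:.4f}")
```

The bimodal target is not log-concave at $t = 0$, but as $t$ grows the mixture's score approaches $-x$, the score of the standard Gaussian, mirroring the transition to strict log-concavity described above.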