🤖 AI Summary
This work bounds the sampling error, measured in Wasserstein distance, of generative models based on critically-damped Langevin diffusion (CLD). It analyzes a generalized dynamics that adds a tunable hyperparameter controlling the noise applied to the data coordinate of the extended state space, which influences the smoothness of sample paths. Theoretically, it derives a novel upper bound on the Wasserstein sampling error of CLD-based generative models. The approach builds on CLDs, which are inspired by statistical mechanics and Hamiltonian dynamics and come with their own score-matching and sampling procedures. The discretization error analysis gives practical guidance for tuning the hyperparameter, leading to improved sampling performance.
📝 Abstract
Score-based Generative Models (SGMs) have achieved impressive performance in data generation across a wide range of applications and benefit from strong theoretical guarantees. Recently, methods inspired by statistical mechanics, in particular Hamiltonian dynamics, have introduced Critically-damped Langevin Diffusions (CLDs), which define diffusion processes on extended spaces by coupling the data with auxiliary variables. These approaches, along with their associated score-matching and sampling procedures, have been shown numerically to outperform standard diffusion-based samplers. In this paper, we analyze a generalized dynamics that extends classical CLDs by introducing an additional hyperparameter controlling the noise applied to the data coordinate, thereby better exploiting the extended space. We further derive a novel upper bound on the sampling error of CLD-based generative models in the Wasserstein metric. This additional hyperparameter influences the smoothness of sample paths, and our discretization error analysis provides practical guidance for its tuning, leading to improved sampling performance.
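To make the role of the extra hyperparameter concrete, here is a minimal sketch of a forward process of this kind, simulated with Euler-Maruyama in one dimension. The SDE used is an assumption and not the paper's stated parametrization: it takes a textbook CLD with unit mass and time scale (critical damping then corresponds to `gamma = 2`) and adds a hypothetical parameter `eps` that injects noise into the data coordinate, with a matching drift term chosen so that the standard Gaussian remains stationary. The function and parameter names are illustrative only.

```python
import math
import random


def simulate_generalized_cld(x0, eps=0.5, gamma=2.0, dt=1e-3, n_steps=1000, seed=0):
    """Euler-Maruyama simulation of an ASSUMED generalized CLD in 1D.

    Assumed dynamics (illustrative, not taken from the paper):
        dx = (v - 0.5 * eps^2 * x) dt + eps dW_x
        dv = (-x - gamma * v) dt + sqrt(2 * gamma) dW_v
    With eps = 0 this reduces to a classical CLD where noise enters only
    through the auxiliary velocity v. Returns the list of x values.
    """
    rng = random.Random(seed)
    x = float(x0)
    v = rng.gauss(0.0, 1.0)  # auxiliary variable drawn from N(0, 1)
    sqrt_dt = math.sqrt(dt)
    path = [x]
    for _ in range(n_steps):
        dw_x = sqrt_dt * rng.gauss(0.0, 1.0)
        dw_v = sqrt_dt * rng.gauss(0.0, 1.0)
        # Both updates use the state from the start of the step.
        x_next = x + (v - 0.5 * eps**2 * x) * dt + eps * dw_x
        v_next = v + (-x - gamma * v) * dt + math.sqrt(2.0 * gamma) * dw_v
        x, v = x_next, v_next
        path.append(x)
    return path


def quadratic_variation(path):
    """Sum of squared increments: a simple roughness measure for a path."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


if __name__ == "__main__":
    smooth = simulate_generalized_cld(0.0, eps=0.0)
    rough = simulate_generalized_cld(0.0, eps=1.0)
    print("roughness with eps=0:", quadratic_variation(smooth))
    print("roughness with eps=1:", quadratic_variation(rough))
```

In this sketch, setting `eps = 0` leaves the data coordinate driven only by the velocity, so its path is comparatively smooth, while a larger `eps` makes it rougher. This is the qualitative trade-off between path smoothness and discretization error that the abstract refers to, shown here only under the assumed dynamics.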