Accelerated Regularized Wasserstein Proximal Sampling Algorithms

📅 2026-01-14
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the slow convergence and insufficient particle diversity encountered in sampling from non-log-concave and ill-conditioned Gibbs distributions by proposing the Accelerated Regularized Wasserstein Proximal (ARWP) method. ARWP introduces, for the first time, a Nesterov-type acceleration mechanism into particle-based sampling dynamics, modeling the evolution via a second-order fractional ordinary differential equation and incorporating regularized Wasserstein proximal updates to realize an accelerated information gradient flow in Euclidean space. Theoretical analysis demonstrates that ARWP substantially improves asymptotic mixing rates and enhances exploration of distribution tails. Empirical evaluations on multimodal Gaussian mixtures, the Rosenbrock distribution, and non-log-concave Bayesian neural networks confirm the method's superior performance in discrete-time mixing speed, structured particle convergence, and generalization capability.

πŸ“ Abstract
We consider sampling from a Gibbs distribution by evolving a finite number of particles using a particular score estimator rather than Brownian motion. To accelerate the particles, we consider a second-order score-based ODE, similar to Nesterov acceleration. In contrast to traditional kernel density score estimation, we use the recently proposed regularized Wasserstein proximal method, yielding the Accelerated Regularized Wasserstein Proximal method (ARWP). We provide a detailed analysis of continuous- and discrete-time non-asymptotic and asymptotic mixing rates for Gaussian initial and target distributions, using techniques from Euclidean acceleration and accelerated information gradients. Compared with the kinetic Langevin sampling algorithm, the proposed algorithm exhibits a higher contraction rate in the asymptotic time regime. Numerical experiments are conducted in various low-dimensional settings, including multi-modal Gaussian mixtures and ill-conditioned Rosenbrock distributions. ARWP exhibits structured and convergent particles, accelerated discrete-time mixing, and faster tail exploration than the non-accelerated regularized Wasserstein proximal method and kinetic Langevin methods. Additionally, ARWP particles exhibit better generalization properties for some non-log-concave Bayesian neural network tasks.
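The dynamics the abstract describes can be sketched as follows. This is a minimal illustrative implementation, not the paper's algorithm: the constant damping `gamma`, the step size `h`, the proximal time `T`, and the softmax-weighted surrogate for the regularized Wasserstein proximal score are all assumptions made for this sketch, and the paper's actual kernel formula and damping schedule differ in their exact form.

```python
import numpy as np

def grad_f(x):
    # Potential f(x) = ||x||^2 / 2, i.e. a standard Gaussian target exp(-f).
    return x

def proximal_score(X, T):
    """Score estimate in the spirit of the regularized Wasserstein proximal
    kernel formula (constants here are illustrative, not the paper's).
    Surrogate density: rho(x) proportional to
        sum_j exp(-||x - y_j||^2 / (2T) - f(y_j) / 2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)     # pairwise ||x_i - y_j||^2
    logw = -sq / (2 * T) - 0.5 * (X ** 2).sum(-1)[None, :]  # -f(y_j)/2 weighting
    logw -= logw.max(axis=1, keepdims=True)                 # numerical stability
    W = np.exp(logw)
    W /= W.sum(axis=1, keepdims=True)                       # softmax weights per row
    return (W @ X - X) / T                                  # grad log rho at each x_i

rng = np.random.default_rng(0)
N, d = 200, 2
X = 3.0 + 0.1 * rng.standard_normal((N, d))  # particles start far from the mode
V = np.zeros((N, d))                         # velocities for second-order dynamics
h, gamma, T = 0.05, 2.0, 0.5                 # step, damping, proximal time (assumed)

for _ in range(600):
    # Damped second-order (Nesterov-like) ODE:
    #   x'' + gamma x' = -(grad f(x) + grad log rho(x)),
    # discretized with a semi-implicit momentum update; no Brownian noise.
    V = (1 - gamma * h) * V - h * (grad_f(X) + proximal_score(X, T))
    X = X + h * V
```

With these assumed constants, the particle cloud drifts deterministically from its initial cluster near (3, 3) toward the origin and spreads out, the momentum term shortening the transient relative to the corresponding first-order (velocity-free) flow.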
Problem

Research questions and friction points this paper is trying to address.

sampling
Gibbs distribution
non-log-concave
accelerated mixing
Wasserstein proximal
Innovation

Methods, ideas, or system contributions that make the work stand out.

Accelerated Sampling
Regularized Wasserstein Proximal
Score-based ODE
Non-asymptotic Mixing Rate
Nesterov Acceleration
Hong Ye Tan
Hedrick Assistant Adjunct Professor, UCLA
Machine Learning · Optimization · Inverse Problems
Stanley Osher
Department of Mathematics, University of California, Los Angeles, CA 90095
Wuchen Li
Department of Mathematics, University of South Carolina, Columbia, SC 29208