🤖 AI Summary
Existing decentralized stochastic gradient Langevin dynamics (SGLD) methods assume static network topologies and suffer from network-induced stationary sampling bias, even under full-batch settings. This limits their applicability to multi-agent Bayesian learning over time-varying communication graphs.
Method: We propose DIGing-SGLD, the first decentralized Langevin sampling algorithm that supports time-varying topologies. It integrates the gradient-tracking mechanism of the DIGing algorithm into the SGLD framework and uses a constant step size to achieve geometric convergence.
Contribution/Results: We provide the first finite-time Wasserstein-2 convergence guarantee for decentralized SGLD-based sampling over time-varying networks. Under strong convexity and smoothness assumptions, the distribution of the iterates converges geometrically to within $O(\sqrt{\eta})$ of the target posterior, where $\eta$ is the step size. Experiments on Bayesian linear and logistic regression show strong performance of DIGing-SGLD on time-varying networks, supporting the theoretical claims.
📝 Abstract
Sampling from a target distribution induced by training data is central to Bayesian learning, with Stochastic Gradient Langevin Dynamics (SGLD) serving as a key tool for scalable posterior sampling and decentralized variants enabling learning when data are distributed across a network of agents. This paper introduces DIGing-SGLD, a decentralized SGLD algorithm designed for scalable Bayesian learning in multi-agent systems operating over time-varying networks. Existing decentralized SGLD methods are restricted to static network topologies, and many exhibit steady-state sampling bias caused by network effects, even when full batches are used. DIGing-SGLD overcomes these limitations by integrating Langevin-based sampling with the gradient-tracking mechanism of the DIGing algorithm, originally developed for decentralized optimization over time-varying networks, thereby enabling efficient and bias-free sampling without a central coordinator. To our knowledge, we provide the first finite-time non-asymptotic Wasserstein convergence guarantees for decentralized SGLD-based sampling over time-varying networks, with explicit constants. Under standard strong convexity and smoothness assumptions, DIGing-SGLD achieves geometric convergence to an $O(\sqrt{\eta})$ neighborhood of the target distribution, where $\eta$ is the stepsize, with dependence on the target accuracy matching the best-known rates for centralized and static-network SGLD algorithms using constant stepsize. Numerical experiments on Bayesian linear and logistic regression validate the theoretical results and demonstrate the strong empirical performance of DIGing-SGLD under dynamically evolving network conditions.
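The abstract does not spell out the update rule, so the following is only a minimal sketch of how gradient tracking and Langevin noise might be combined. It assumes the standard DIGing recursion (mix with neighbors, step along a tracker of the average gradient, then correct the tracker with the local gradient difference) with Gaussian noise of scale $\sqrt{2\eta}$ added to each iterate. The quadratic local losses, the alternating pairwise-averaging network, and the names `mix` and `run_tracking_langevin` are hypothetical choices made for illustration and may differ from the paper's setup and scaling.

```python
import math
import random


def mix(values, k):
    """Apply one round of pairwise averaging on a ring of agents.

    Even rounds pair (0,1), (2,3), ...; odd rounds pair (1,2), ..., (n-1,0).
    Each round is a doubly stochastic mixing step, and the pairing changes
    with k, standing in for a time-varying network. Assumes n is even.
    """
    n = len(values)
    out = list(values)
    offset = k % 2
    for i in range(offset, n + offset, 2):
        p, q = i % n, (i + 1) % n
        out[p] = out[q] = 0.5 * (values[p] + values[q])
    return out


def run_tracking_langevin(a, b, eta, steps, burn_in=0, seed=0):
    """Run an illustrative gradient-tracking Langevin recursion.

    Agent i holds f_i(x) = 0.5 * a[i] * (x - b[i])**2, so the target density
    proportional to exp(-sum_i f_i) is Gaussian. Returns agent 0's samples
    after burn-in, plus the final trackers y and local gradients g.
    """
    rng = random.Random(seed)
    n = len(a)
    x = [0.0] * n
    g = [a[i] * (x[i] - b[i]) for i in range(n)]  # local gradients
    y = list(g)                                   # trackers start at local gradients
    samples = []
    for k in range(steps):
        x_mixed = mix(x, k)
        y_mixed = mix(y, k)
        # Iterate update: consensus step, tracked-gradient step, Langevin noise.
        x = [
            x_mixed[i] - eta * y[i] + math.sqrt(2.0 * eta) * rng.gauss(0.0, 1.0)
            for i in range(n)
        ]
        g_new = [a[i] * (x[i] - b[i]) for i in range(n)]
        # Tracker update: consensus step plus local gradient difference.
        y = [y_mixed[i] + g_new[i] - g[i] for i in range(n)]
        g = g_new
        if k >= burn_in:
            samples.append(x[0])
    return samples, y, g


if __name__ == "__main__":
    a, b = [1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 2.0, 0.0]
    samples, y, g = run_tracking_langevin(a, b, eta=0.01, steps=20000, burn_in=2000)
    target_mean = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)
    print("sample mean at agent 0:", sum(samples) / len(samples))
    print("target posterior mean: ", target_mean)
```

Because every mixing round preserves the sum of the values it averages, the trackers in this sketch always sum to the current local gradients. That invariant is the property gradient tracking relies on, and it is what lets each agent step along an estimate of the network-wide gradient rather than only its own.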