🤖 AI Summary
This work quantifies how quickly the dependence between samples decays in continuous-space MCMC sampling, specifically along the Langevin dynamics, the Unadjusted Langevin Algorithm (ULA), and the Proximal Sampler, when the target distribution is strongly log-concave. Methodologically, the authors develop a framework for the contraction of Φ-mutual information along non-stationary Markov trajectories: they extend strong data processing inequalities (SDPIs) from the stationary distribution to the entire trajectory and link Φ-Sobolev inequalities to sample dependence. Their key contribution is a proof that the Φ-mutual information between the first and the k-th iterate decays exponentially in k. This yields a unified, quantitative guarantee of asymptotic sample independence for these widely used samplers, complementing mixing-time analyses, which control convergence to the stationary distribution but not the dependence between samples.
📝 Abstract
The mixing time of a Markov chain determines how fast the iterates of the Markov chain converge to the stationary distribution; however, it does not control the dependencies between samples along the Markov chain. In this paper, we study the question of how fast the samples become approximately independent along popular Markov chains for continuous-space sampling: the Langevin dynamics in continuous time, and the Unadjusted Langevin Algorithm and the Proximal Sampler in discrete time. We measure the dependence between samples via $\Phi$-mutual information, which is a broad generalization of the standard mutual information, and which is equal to $0$ if and only if the samples are independent. We show that along these Markov chains, the $\Phi$-mutual information between the first and the $k$-th iterate decreases to $0$ exponentially fast in $k$ when the target distribution is strongly log-concave. Our proof technique is based on showing that Strong Data Processing Inequalities (SDPIs) hold along the Markov chains. To prove fast mixing of the Markov chains, we only need to show that the SDPIs hold for the stationary distribution. In contrast, to prove the contraction of $\Phi$-mutual information, we need to show that the SDPIs hold along the entire trajectories of the Markov chains; we prove this when the iterates along the Markov chains satisfy the corresponding $\Phi$-Sobolev inequality, which is implied by the strong log-concavity of the target distribution.
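To make the exponential decay concrete, the following is a minimal sketch for one special case that can be worked out in closed form; it is an illustration, not the paper's proof technique. It assumes a one-dimensional Gaussian target $N(0, 1/\alpha)$ (which is $\alpha$-strongly log-concave), a Gaussian initialization $X_0 \sim N(0, \sigma_0^2)$, and the standard mutual information (the case $\Phi(x) = x \log x$). Under these assumptions the ULA update is linear, $X_{k+1} = (1 - \eta\alpha) X_k + \sqrt{2\eta}\, Z_k$, so $(X_0, X_k)$ is jointly Gaussian and $I(X_0; X_k) = -\tfrac{1}{2}\log(1 - \rho_k^2)$, where $\rho_k$ is their correlation. The function name and the parameter names `alpha`, `eta`, and `sigma0_sq` are hypothetical and chosen only for this example.

```python
import math


def ula_gaussian_mutual_information(k, alpha=1.0, eta=0.1, sigma0_sq=1.0):
    """Closed-form I(X_0; X_k) in nats for ULA on the 1-D target N(0, 1/alpha).

    Assumes X_0 ~ N(0, sigma0_sq) and step size eta with 0 < eta * alpha < 1.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    a = 1.0 - eta * alpha  # ULA update: X_{k+1} = a * X_k + sqrt(2 * eta) * Z_k
    if not 0.0 < a < 1.0:
        raise ValueError("need 0 < eta * alpha < 1")

    a2k = a ** (2 * k)
    # X_k = a^k * X_0 + (accumulated Gaussian noise independent of X_0).
    signal_var = a2k * sigma0_sq
    noise_var = 2.0 * eta * (1.0 - a2k) / (1.0 - a * a)

    # rho_k^2 = signal / (signal + noise), hence
    # I = -0.5 * log(1 - rho_k^2) = 0.5 * log(1 + signal / noise).
    return 0.5 * math.log1p(signal_var / noise_var)


if __name__ == "__main__":
    for k in (1, 5, 10, 20, 50):
        mi = ula_gaussian_mutual_information(k)
        print(f"k = {k:3d}   I(X_0; X_k) = {mi:.3e} nats")
```

In this Gaussian case the mutual information shrinks by a factor of roughly $(1 - \eta\alpha)^2$ per step once $k$ is large, which matches the qualitative statement that dependence on the first iterate decays exponentially in $k$.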