Characterizing Dependence of Samples along the Langevin Dynamics and Algorithms via Contraction of $\Phi$-Mutual Information

📅 2024-02-26

🤖 AI Summary

This work quantifies how fast the dependence between samples decays in continuous-space MCMC sampling, specifically along the Langevin dynamics, the Unadjusted Langevin Algorithm (ULA), and the Proximal Sampler, when the target distribution is strongly log-concave. Methodologically, the authors develop the first Φ-mutual information contraction framework along non-stationary Markov trajectories, extending strong data processing inequalities (SDPIs) from the stationary distribution to the entire trajectory and linking Φ-Sobolev inequalities to sample dependence. Their key contribution is a rigorous proof that the Φ-mutual information between the first and the $k$-th iterate decays exponentially in $k$. This yields the first unified, quantitative guarantee of asymptotic sample independence for these widely used samplers, complementing mixing-time analyses, which control convergence to the stationary distribution but not the dependence between samples.

📝 Abstract

The mixing time of a Markov chain determines how fast the iterates of the Markov chain converge to the stationary distribution; however, it does not control the dependencies between samples along the Markov chain. In this paper, we study the question of how fast the samples become approximately independent along popular Markov chains for continuous-space sampling: the Langevin dynamics in continuous time, and the Unadjusted Langevin Algorithm and the Proximal Sampler in discrete time. We measure the dependence between samples via $\Phi$-mutual information, which is a broad generalization of the standard mutual information, and which is equal to $0$ if and only if the samples are independent. We show that along these Markov chains, the $\Phi$-mutual information between the first and the $k$-th iterate decreases to $0$ exponentially fast in $k$ when the target distribution is strongly log-concave. Our proof technique is based on showing that Strong Data Processing Inequalities (SDPIs) hold along the Markov chains. To prove fast mixing of the Markov chains, we only need to show that the SDPIs hold for the stationary distribution. In contrast, to prove the contraction of $\Phi$-mutual information, we need to show that the SDPIs hold along the entire trajectories of the Markov chains; we prove this when the iterates along the Markov chains satisfy the corresponding $\Phi$-Sobolev inequality, which is implied by the strong log-concavity of the target distribution.
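
For concreteness, the quantity in the abstract can be written out as below. This is the usual textbook form of a $\Phi$-divergence and the mutual information built from it, for a convex function $\Phi$; the symbols $D_\Phi$, $P_{X,Y}$, $P_X$, and $P_Y$ are notation assumed here and are not quoted from the paper.

$$
D_\Phi(P \,\|\, Q) = \mathbb{E}_Q\!\left[\Phi\!\left(\frac{dP}{dQ}\right)\right] - \Phi(1),
\qquad
I_\Phi(X; Y) = D_\Phi\!\left(P_{X,Y} \,\|\, P_X \otimes P_Y\right).
$$

Taking $\Phi(x) = x \log x$ gives the KL divergence and hence the standard mutual information, while $\Phi(x) = (x-1)^2$ gives the chi-squared divergence.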

Problem

Research questions and friction points this paper is trying to address.

Analyzes how fast samples become approximately independent along Markov chains for continuous-space sampling

Measures dependence via Φ-mutual information

Proves exponential decay of dependence when the target distribution is strongly log-concave
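
The exponential decay can be checked by hand in one special case. The sketch below is an illustration, not the paper's method or experiments: it assumes a one-dimensional Gaussian target $N(0, 1/\alpha)$ and a Gaussian initial iterate, so that the ULA iterates are jointly Gaussian and the standard mutual information (the case $\Phi(x) = x \log x$) has a closed form. The function name and all parameter values are hypothetical.

```python
import math


def ula_gaussian_mi(k, alpha=1.0, eta=0.1, sigma0_sq=4.0):
    """Mutual information (in nats) between X_0 and X_k along ULA.

    Assumed setting (illustrative only): target N(0, 1/alpha), X_0 ~ N(0, sigma0_sq),
    and the ULA update X_{j+1} = (1 - eta*alpha) * X_j + sqrt(2*eta) * Z_j
    with independent standard normal Z_j.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0.0 < eta * alpha < 2.0:
        raise ValueError("need 0 < eta * alpha < 2 so that the update is a contraction")

    a = 1.0 - eta * alpha          # per-step contraction factor of the mean
    a2k = a ** (2 * k)

    # Var(X_k) = a^{2k} * Var(X_0) + 2*eta * sum_{j<k} a^{2j}
    var_k = a2k * sigma0_sq + 2.0 * eta * (1.0 - a2k) / (1.0 - a * a)

    # Squared correlation between X_0 and X_k, using Cov(X_0, X_k) = a^k * Var(X_0)
    rho_sq = a2k * sigma0_sq / var_k

    # Mutual information of a bivariate Gaussian: -0.5 * log(1 - rho^2)
    return -0.5 * math.log1p(-rho_sq)


mi = [ula_gaussian_mi(k) for k in range(1, 51)]
for k in (1, 10, 25, 50):
    print(f"k={k:2d}  I(X_0; X_k) = {mi[k - 1]:.3e} nats")
```

In this toy setting the ratio of successive values approaches $(1 - \eta\alpha)^2$, which is the kind of geometric rate in $k$ that the paper establishes for general strongly log-concave targets.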

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Φ-mutual information for dependence measurement

Applies Strong Data Processing Inequalities (SDPIs)

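Requires the iterates along the Markov chain to satisfy a Φ-Sobolev inequality, which strong log-concavity of the target distribution implies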
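
For reference, a Φ-Sobolev inequality for a distribution $\nu$ is commonly stated in the following form. The constant $\alpha$, the test function $f$, and the notation $\mathrm{Ent}^{\Phi}_{\nu}$ are assumptions made here for illustration, and the paper's normalization may differ.

$$
\mathrm{Ent}^{\Phi}_{\nu}(f) := \mathbb{E}_\nu[\Phi(f)] - \Phi\!\left(\mathbb{E}_\nu[f]\right)
\;\le\; \frac{1}{2\alpha}\, \mathbb{E}_\nu\!\left[\Phi''(f)\, \|\nabla f\|^2\right].
$$

With $\Phi(x) = x \log x$ this reduces to the log-Sobolev inequality, and with $\Phi(x) = x^2$ it reduces to the Poincaré inequality $\mathrm{Var}_\nu(f) \le \frac{1}{\alpha}\, \mathbb{E}_\nu\!\left[\|\nabla f\|^2\right]$.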
