🤖 AI Summary
Gradient-based MCMC methods fail for high-dimensional nonsmooth Bayesian posteriors, which are common in sparse signal recovery and image denoising, because the posterior density is not differentiable. To address this, we propose proximal Hamiltonian Monte Carlo (p-HMC), which integrates proximal operators from convex optimization into HMC dynamics. By leveraging the Moreau-Yosida envelope, p-HMC constructs a smooth surrogate potential whose gradient sharply approximates that of the nonsmooth log posterior. We establish geometric ergodicity of the resulting chain and show that its computational burden is at most that of the current state of the art. Experiments on logistic regression and low-rank matrix estimation show that p-HMC converges faster and samples more efficiently than competing MCMC methods. Crucially, p-HMC bridges convex optimization and Bayesian sampling, enabling principled posterior inference for nonsmooth models without sacrificing theoretical guarantees or practical performance.
📝 Abstract
The Bayesian formulation of modern signal processing problems has called for improved Markov chain Monte Carlo (MCMC) sampling algorithms for inference. Efficient sampling techniques have become indispensable for the high-dimensional distributions that characterize many core signal processing problems, e.g., image denoising and sparse signal recovery. A major obstacle to building effective sampling strategies, however, is the non-differentiability of the underlying posterior density. Such posteriors are common in models designed to recover sparse signals, and they hinder the use of efficient gradient-based MCMC sampling techniques. We circumvent this problem by proposing a proximal Hamiltonian Monte Carlo (p-HMC) algorithm, which leverages tools from convex optimization, namely proximal mappings and Moreau-Yosida (MY) envelopes, within Hamiltonian dynamics. Our method improves upon the current state-of-the-art nonsmooth Hamiltonian Monte Carlo: it achieves a sharper approximation of the gradient of the log posterior density at a computational burden no greater than that of the state of the art. A chief contribution of this work is the theoretical analysis of p-HMC: we provide conditions for geometric ergodicity of the underlying HMC chain. On the practical front, we offer guidance on choosing the key p-HMC hyperparameter -- the regularization parameter of the MY envelope. We demonstrate p-HMC's efficiency over other MCMC algorithms on benchmark problems of logistic regression and low-rank matrix estimation.
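To make the core idea concrete, here is a minimal sketch (not the paper's exact algorithm) of how a Moreau-Yosida envelope supplies a smooth gradient for HMC. It assumes an illustrative L1 (lasso-type) nonsmooth term, whose proximal operator is soft-thresholding, and uses the standard identity ∇f_λ(x) = (x − prox_{λf}(x))/λ for the envelope gradient; the function names and the single leapfrog step are illustrative choices, not the authors' implementation.

```python
import numpy as np

def prox_l1(x, lam):
    """Proximal operator of lam * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def grad_my_envelope(x, lam):
    """Gradient of the Moreau-Yosida envelope of ||x||_1:
    grad f_lam(x) = (x - prox_{lam f}(x)) / lam.
    The envelope is differentiable with a 1/lam-Lipschitz gradient,
    even though ||x||_1 itself is nonsmooth."""
    return (x - prox_l1(x, lam)) / lam

def leapfrog_step(x, p, eps, grad_smooth, lam):
    """One leapfrog step for the smoothed potential
    U_lam(x) = g(x) + f_lam(x), where grad g is supplied by
    `grad_smooth` (e.g., the negative log-likelihood gradient)."""
    grad_U = lambda z: grad_smooth(z) + grad_my_envelope(z, lam)
    p = p - 0.5 * eps * grad_U(x)   # half-step on momentum
    x = x + eps * p                 # full step on position
    p = p - 0.5 * eps * grad_U(x)   # half-step on momentum
    return x, p
```

Smaller values of the regularization parameter `lam` make the envelope track the nonsmooth term more tightly but increase the Lipschitz constant of the gradient, which is the trade-off behind the hyperparameter guidance discussed above.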