🤖 AI Summary
This paper addresses maximum marginal likelihood estimation in latent variable models. It proposes JALA-EM, a method that combines a sampler from nonequilibrium statistical mechanics with sequential Monte Carlo. The core idea is to use Jarzynski's equality to build a sampling framework based on a weighted unadjusted Langevin algorithm (ULA) with recursively updated weights, embedded within an EM-type scheme. JALA-EM yields a recursive, scalable estimator of the marginal likelihood and comes with nonasymptotic convergence guarantees when implemented with stochastic gradients. Experiments indicate that JALA-EM performs comparably to existing methods in accuracy and computational efficiency across several latent variable models, including Gaussian mixture models, probabilistic PCA, and variational autoencoders, and that its recursive marginal likelihood estimates make it particularly well suited to model selection.
📝 Abstract
We utilise a sampler originating from nonequilibrium statistical mechanics, termed here the Jarzynski-adjusted Langevin algorithm (JALA), to build statistical estimation methods in latent variable models. We achieve this by leveraging Jarzynski's equality and developing algorithms based on a weighted version of the unadjusted Langevin algorithm (ULA) with recursively updated weights. Adapting this for latent variable models, we develop a sequential Monte Carlo (SMC) method, termed JALA-EM, that provides the maximum marginal likelihood estimate of the parameters. Under suitable regularity assumptions on the marginal likelihood, we provide a nonasymptotic analysis of the JALA-EM scheme implemented with stochastic gradient descent and show that it provably converges to the maximum marginal likelihood estimate. We demonstrate the performance of JALA-EM on a variety of latent variable models and show that it performs comparably to existing methods in terms of accuracy and computational efficiency. Importantly, the ability to recursively estimate marginal likelihoods (an uncommon feature among scalable methods) makes our approach particularly suited for model selection, which we validate through dedicated experiments.
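To make the phrase "weighted version of ULA with recursively updated weights" concrete, here is a minimal sketch on a toy problem. It is not the authors' implementation and it leaves out the EM parameter updates entirely. A one-dimensional Gaussian potential with a fixed, hand-chosen scale schedule stands in for the parameter path, and the weight recursion is the standard SMC-sampler form with ULA proposals as both forward and backward kernels. All function and variable names (`potential`, `alpha`, `weighted_ula`, `schedule`) are hypothetical.

```python
import numpy as np


def potential(x, s):
    """Toy potential U_s(x) = x^2 / (2 s^2); exp(-U_s) has normaliser Z_s = s * sqrt(2 * pi)."""
    return 0.5 * x**2 / s**2


def grad_potential(x, s):
    """Gradient of the toy potential with respect to x."""
    return x / s**2


def alpha(x, y, s, h):
    """U_s(x) plus the non-symmetric part of the negative log ULA density for a move x -> y.

    The |y - x|^2 / (4h) term and the Gaussian constant are dropped because they
    cancel between the forward and backward kernels in the weight update.
    """
    g = grad_potential(x, s)
    return potential(x, s) + 0.5 * (y - x) * g + 0.25 * h * g**2


def weighted_ula(schedule, n_particles=20_000, h=0.05, seed=0):
    """Run ULA along a schedule of scales while accumulating log-weights recursively."""
    rng = np.random.default_rng(seed)
    # Assumption: exact draws from the initial target are available.
    x = schedule[0] * rng.standard_normal(n_particles)
    log_w = np.zeros(n_particles)
    for s_now, s_next in zip(schedule[:-1], schedule[1:]):
        # Unadjusted Langevin step targeting the current potential.
        noise = rng.standard_normal(n_particles)
        x_new = x - h * grad_potential(x, s_now) + np.sqrt(2.0 * h) * noise
        # Recursive weight update: forward kernel uses s_now, backward kernel uses s_next.
        log_w += alpha(x, x_new, s_now, h) - alpha(x_new, x, s_next, h)
        x = x_new
    return x, log_w


# Hypothetical schedule: shrink the scale from 2 to 1 in 50 steps.
schedule = np.linspace(2.0, 1.0, 51)
x, log_w = weighted_ula(schedule)
w = np.exp(log_w)

# The mean weight estimates the ratio of normalisers Z_final / Z_initial = 1 / 2.
z_ratio = w.mean()
# Self-normalised weights estimate expectations under the final target (E[x^2] = 1).
second_moment = np.sum(w * x**2) / np.sum(w)
print(z_ratio, second_moment)
```

In a latent variable model, the potential would play the role of the negative log joint density of latents and data, so the normaliser is the marginal likelihood; the same recursion is what allows its ratio to be tracked along the parameter path. In this toy the schedule is fixed in advance, whereas an EM-type scheme would choose each new parameter value from weighted particle estimates of the gradient.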