🤖 AI Summary
Background: Existing analyses of the EM algorithm struggle to characterize convergence rates and are largely restricted to local linear convergence, often relying on strong assumptions such as strong convexity or local strong monotonicity.
Method: We propose a unified analytical framework grounded in Wasserstein gradient flows and coordinate descent, modeling EM as variational optimization over the product of a Euclidean parameter space and a space of probability measures.
Contribution/Results: Under a generalized logarithmic Sobolev inequality, we establish, for the first time, global exponential convergence of EM and derive explicit convergence rates together with finite-sample error bounds. The framework extends naturally to key EM variants, including the Neal–Hinton formulation, without requiring strong convexity or local strong monotonicity, and it provides the first non-asymptotic, global convergence theory for high-dimensional latent-variable models such as Gaussian mixture models and variational autoencoders.
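For intuition about why a log-Sobolev-type condition yields exponential rather than merely local linear convergence, the classical single-measure picture is sketched below. This is only illustrative: π, ρ_t, and λ are generic placeholders, constant conventions vary, and the generalized inequality used in the paper, posed on the product space, may take a different form.

```latex
% Illustrative sketch (not the paper's exact condition). Suppose the target
% density \pi satisfies a log-Sobolev inequality with constant \lambda > 0:
%   \mathrm{Ent}_{\pi}(f^2) \le \tfrac{2}{\lambda} \int |\nabla f|^2 \, d\pi .
% This bounds relative entropy by relative Fisher information,
%   \mathrm{KL}(\rho \,\|\, \pi) \le \tfrac{1}{2\lambda}\, I(\rho \,\|\, \pi),
% so along the Wasserstein gradient flow of \rho \mapsto \mathrm{KL}(\rho \,\|\, \pi),
% whose entropy dissipation is the Fisher information, Gronwall's lemma gives
\frac{d}{dt}\,\mathrm{KL}(\rho_t \,\|\, \pi) \;=\; -\,I(\rho_t \,\|\, \pi)
  \;\le\; -\,2\lambda\,\mathrm{KL}(\rho_t \,\|\, \pi)
\quad\Longrightarrow\quad
\mathrm{KL}(\rho_t \,\|\, \pi) \;\le\; e^{-2\lambda t}\,\mathrm{KL}(\rho_0 \,\|\, \pi).
```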
📝 Abstract
By utilizing recently developed tools for constructing gradient flows on Wasserstein spaces, we extend an analysis technique commonly employed to understand alternating minimization algorithms on Euclidean space to the Expectation Maximization (EM) algorithm, via its representation due to Neal and Hinton (1998) as coordinate-wise minimization on the product of a Euclidean space and a space of probability distributions. In so doing we obtain finite-sample error bounds and exponential convergence of the EM algorithm under a natural generalisation of a log-Sobolev inequality. We further demonstrate that the analysis technique is flexible enough to also cover several variants of the EM algorithm.
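The Neal and Hinton (1998) representation referenced above is the standard free-energy (ELBO) view of EM, which makes the coordinate-wise structure explicit. A minimal sketch in generic notation (not necessarily the paper's) follows: θ denotes the Euclidean parameter, q a distribution over the latent variable z, and x the observed data.

```latex
% EM as coordinate-wise maximization of the Neal--Hinton free energy
% (equivalently, minimization of its negative) over (\theta, q):
F(\theta, q) \;=\; \mathbb{E}_{q(z)}\!\big[\log p_\theta(x, z)\big] + \mathcal{H}(q)
             \;=\; \log p_\theta(x) \;-\; \mathrm{KL}\!\big(q(z)\,\big\|\,p_\theta(z \mid x)\big),
% so one EM sweep consists of two exact coordinate updates:
\text{E-step:}\quad q^{(t+1)} \;=\; \arg\max_{q}\, F\big(\theta^{(t)}, q\big) \;=\; p_{\theta^{(t)}}(z \mid x),
\qquad
\text{M-step:}\quad \theta^{(t+1)} \;=\; \arg\max_{\theta}\, F\big(\theta, q^{(t+1)}\big).
```

Viewed this way, EM is exact coordinate ascent on the product of the parameter space and the space of distributions over z, which is the structure the Wasserstein gradient-flow analysis exploits.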