🤖 AI Summary
This work establishes non-asymptotic error bounds for Particle Gradient Descent (PGD), an algorithm for maximum likelihood estimation in large latent-variable models obtained by discretizing a gradient flow of the free energy. The authors introduce a condition that generalizes both the logarithmic Sobolev inequality (LSI) and the Polyak–Łojasiewicz inequality (PŁI), and show that under it the flow converges exponentially fast to the set of free-energy minimizers. The proof extends two classical implications to this setting: that the LSI implies Talagrand's inequality (optimal transport) and that the PŁI implies the quadratic growth condition (optimization). A generalization of the Bakry–Émery Theorem then shows the condition holds for models with strongly concave log-likelihoods; for such models the authors additionally control PGD's discretization error, yielding non-asymptotic guarantees for this particle-based approach to maximum likelihood estimation.
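The abstract does not write out the objects involved, so as orientation here is one standard way to set them up; the notation, and in particular the exact form of the generalized condition displayed below, are our reading rather than a quotation from the paper.

```latex
% Notation is ours. For data x, latent variable z, and joint density
% p_theta(x, z), the free energy over pairs (theta, q) can be written as
\[
F(\theta, q)
  = \int q(z)\,\log\frac{q(z)}{p_\theta(x, z)}\,\mathrm{d}z
  = \mathrm{KL}\bigl(q \,\|\, p_\theta(\cdot \mid x)\bigr) - \log p_\theta(x),
\]
% so jointly minimizing F recovers the MLE in theta and the posterior in q.
% A condition generalizing both the LSI and the PLI would then bound the
% suboptimality gap by a squared gradient norm, e.g.
\[
F(\theta, q) - \inf F
  \le \frac{1}{2\lambda}\Bigl(
      \bigl\|\nabla_\theta F(\theta, q)\bigr\|^2
    + \int \Bigl\|\nabla_z \log\frac{q(z)}{p_\theta(x, z)}\Bigr\|^2 q(z)\,\mathrm{d}z
  \Bigr).
\]
% Fixing theta, this reduces to the LSI for the posterior (the integral is
% the relative Fisher information); taking q to be the posterior, the Fisher
% term vanishes and, by Fisher's identity, the bound becomes the PLI for the
% negative marginal log-likelihood -log p_theta(x).
```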
📝 Abstract
We prove non-asymptotic error bounds for particle gradient descent (PGD) (Kuntz et al., 2023), a recently introduced algorithm for maximum likelihood estimation of large latent variable models obtained by discretizing a gradient flow of the free energy. We begin by showing that, for models satisfying a condition generalizing both the log-Sobolev and the Polyak–Łojasiewicz inequalities (LSI and PŁI, respectively), the flow converges exponentially fast to the set of minimizers of the free energy. We achieve this by extending a result well-known in the optimal transport literature (that the LSI implies the Talagrand inequality) and its counterpart in the optimization literature (that the PŁI implies the so-called quadratic growth condition), and applying it to our new setting. We also generalize the Bakry–Émery Theorem and show that the LSI/PŁI generalization holds for models with strongly concave log-likelihoods. For such models, we further control PGD's discretization error, obtaining non-asymptotic error bounds. While we are motivated by the study of PGD, we believe that the inequalities and results we extend may be of independent interest.
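To make the algorithm concrete, below is a minimal runnable sketch of PGD on a toy Gaussian model. The model, step size, and particle count are our illustrative choices, not the paper's; the update rule follows the high-level description in Kuntz et al. (2023): a gradient step on θ averaged over the particle cloud, coupled with an unadjusted Langevin step on each particle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent-variable model (our choice, not from the paper):
#   z ~ N(theta, 1),  x | z ~ N(z, 1).
# The marginal likelihood is N(x; theta, 2), so the exact MLE is theta* = x.
x = 2.0

def grad_theta(theta, z):
    # d/dtheta log p_theta(x, z) = z - theta
    return z - theta

def grad_z(theta, z):
    # d/dz log p_theta(x, z) = (theta - z) + (x - z)
    return (theta - z) + (x - z)

N, h, K = 1000, 0.05, 500    # particles, step size, iterations
theta = 0.0
z = rng.standard_normal(N)   # particle cloud approximating q

for _ in range(K):
    # Parameter step: gradient ascent on theta, averaged over the cloud.
    theta_next = theta + h * grad_theta(theta, z).mean()
    # Particle step: one unadjusted Langevin move targeting p_theta(. | x).
    z = z + h * grad_z(theta, z) + np.sqrt(2 * h) * rng.standard_normal(N)
    theta = theta_next

print(f"theta after PGD: {theta:.3f}  (exact MLE: {x})")
```

In this toy model the log-likelihood is strongly concave, which is exactly the regime in which the paper's exponential-convergence and discretization-error guarantees apply, and the printed θ should approach the exact MLE θ* = x up to Monte Carlo and discretization error.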