🤖 AI Summary
Problem: Existing symmetry-based guarantees show that variational inference (VI) with location-scale families can recover the mean and correlation matrix of a target distribution, but they are limited to the reverse Kullback-Leibler (KL) divergence and to targets that are symmetric in all of their coordinates.
Method: We generalize symmetry-based guarantees for location-scale variational families to a broader family of divergences, and to targets with even or elliptical symmetry in some but not all coordinates.
Contribution/Results: We show that when the target and the variational family share the relevant symmetries, VI provably recovers the mean and correlation matrix of the target under a broader family of divergences than the reverse KL divergence. The guarantees also cover targets that are symmetric in only some coordinates, a setting that arises naturally in Bayesian hierarchical models, where the prior induces a challenging geometry but still possesses axes of symmetry. The theoretical results are illustrated in a number of experimental settings.
📝 Abstract
We extend several recent results providing symmetry-based guarantees for variational inference (VI) with location-scale families. VI approximates a target density $p$ by the best match $q^*$ in a family $Q$ of tractable distributions that in general does not contain $p$. It is known that VI can recover key properties of $p$, such as its mean and correlation matrix, when $p$ and $Q$ exhibit certain symmetries and $q^*$ is found by minimizing the reverse Kullback-Leibler divergence. We extend these guarantees in two important directions. First, we provide symmetry-based guarantees for a broader family of divergences, highlighting the properties of variational objectives under which VI provably recovers the mean and correlation matrix. Second, we obtain further guarantees for VI when the target density $p$ exhibits even and elliptical symmetries in some but not all of its coordinates. These partial symmetries arise naturally in Bayesian hierarchical models, where the prior induces a challenging geometry but still possesses axes of symmetry. We illustrate these theoretical results in a number of experimental settings.
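As a minimal illustration of the known reverse-KL guarantee mentioned above, the following sketch fits a Gaussian $q$ to a one-dimensional target that is even-symmetric about its mean but is not Gaussian. Everything in it is an assumption made for illustration and is not taken from the paper: the Laplace target with location 1.3 and scale 0.7, the grid-based approximation of $\mathrm{KL}(q \,\|\, p)$, and the derivative-free compass search over $(\mu, \log\sigma)$. The function names `log_laplace`, `reverse_kl`, and `fit_gaussian` are hypothetical.

```python
import math

# Hypothetical 1-D target (an assumption for illustration): a Laplace density,
# which is log-concave and even-symmetric about `loc` but not Gaussian.
def log_laplace(x, loc=1.3, scale=0.7):
    return -abs(x - loc) / scale - math.log(2.0 * scale)

def reverse_kl(mu, sigma, log_p, n=2001, width=8.0):
    """Approximate KL(q || p) = E_q[log q - log p] for q = N(mu, sigma^2)
    with a Riemann sum on the grid mu +/- width * sigma."""
    h = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for i in range(n):
        x = mu - width * sigma + i * h
        z = (x - mu) / sigma
        log_q = -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
        total += math.exp(log_q) * (log_q - log_p(x)) * h
    return total

def fit_gaussian(log_p, mu=0.0, log_sigma=0.0, step=0.5, tol=1e-5):
    """Derivative-free compass search over (mu, log sigma)."""
    best = reverse_kl(mu, math.exp(log_sigma), log_p)
    while step > tol:
        for d_mu, d_ls in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = reverse_kl(mu + d_mu, math.exp(log_sigma + d_ls), log_p)
            if cand < best:
                # Accept the first improving move and search again from there.
                mu, log_sigma, best = mu + d_mu, log_sigma + d_ls, cand
                break
        else:
            step *= 0.5  # no direction improved: refine the step size
    return mu, math.exp(log_sigma), best

mu_hat, sigma_hat, kl_hat = fit_gaussian(log_laplace)
print(f"fitted mean  = {mu_hat:.4f} (target mean 1.3)")
print(f"fitted scale = {sigma_hat:.4f} (target std  {0.7 * math.sqrt(2.0):.4f})")
```

In this sketch the fitted mean should land close to the target's point of symmetry, while the fitted scale need not equal the target's standard deviation, since $Q$ does not contain $p$. This is consistent with guarantees that are stated for the mean and the correlation matrix.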