SoftCVI: contrastive variational inference with self-generated soft labels

📅 2024-07-22
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
In Bayesian inference, the posterior distribution is typically known only up to an intractable normalizing constant, which makes both conventional variational inference (VI) and MCMC methods difficult to apply reliably, especially when the posterior has complex geometry. To address this, the paper proposes Soft Contrastive Variational Inference (SoftCVI), which reformulates VI as a contrastive estimation task. SoftCVI parameterizes a classifier in terms of the variational distribution and learns by sampling that distribution and computing ground-truth soft classification labels from the unnormalized posterior itself, eliminating the need for explicit positive or negative sample pairs. Crucially, the resulting objectives have zero-variance gradients when the variational approximation is exact, without requiring specialized gradient estimators, which improves training stability. SoftCVI supports both simple (e.g., normal) and expressive (normalizing flow) variational families and yields stable, mass-covering objectives that frequently outperform other variational approaches across diverse Bayesian inference tasks.

📝 Abstract
Estimating a distribution given access to its unnormalized density is pivotal in Bayesian inference, where the posterior is generally known only up to an unknown normalizing constant. Variational inference and Markov chain Monte Carlo methods are the predominant tools for this task; however, both are often challenging to apply reliably, particularly when the posterior has complex geometry. Here, we introduce Soft Contrastive Variational Inference (SoftCVI), which allows a family of variational objectives to be derived through a contrastive estimation framework. The approach parameterizes a classifier in terms of a variational distribution, reframing the inference task as a contrastive estimation problem aiming to identify a single true posterior sample among a set of samples. Despite this framing, we do not require positive or negative samples, but rather learn by sampling the variational distribution and computing ground truth soft classification labels from the unnormalized posterior itself. The objectives have zero variance gradient when the variational approximation is exact, without the need for specialized gradient estimators. We empirically investigate the performance on a variety of Bayesian inference tasks, using both simple (e.g. normal) and expressive (normalizing flow) variational distributions. We find that SoftCVI can be used to form objectives which are stable to train and mass-covering, frequently outperforming inference with other variational approaches.
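The core mechanism described in the abstract can be illustrated with a minimal sketch: samples are drawn from the variational distribution, soft classification labels are computed from the unnormalized posterior, and the classifier probabilities are parameterized by the variational density itself. The toy posterior, Gaussian variational family, and single-batch cross-entropy below are illustrative assumptions, and this sketch omits refinements from the paper (such as the family of objectives and the negative-distribution parameterization); it only demonstrates the self-generated soft-label idea.

```python
import numpy as np

def log_unnorm_posterior(theta):
    # Toy unnormalized log posterior: a standard normal, up to a constant.
    return -0.5 * theta ** 2

def log_q(theta, mu, log_sigma):
    # Log density of a Gaussian variational family q_phi(theta).
    sigma = np.exp(log_sigma)
    return (-0.5 * ((theta - mu) / sigma) ** 2
            - log_sigma - 0.5 * np.log(2.0 * np.pi))

def softmax(x):
    # Numerically stable softmax over a batch of logits.
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def softcvi_loss(theta, mu, log_sigma):
    # Soft labels come from the unnormalized posterior itself: no positive
    # or negative samples are needed, only the batch drawn from q_phi.
    labels = softmax(log_unnorm_posterior(theta))
    # The classifier is parameterized in terms of the variational density.
    preds = softmax(log_q(theta, mu, log_sigma))
    # Cross-entropy between soft labels and classifier probabilities.
    return -np.sum(labels * np.log(preds))

rng = np.random.default_rng(0)
theta = rng.standard_normal(8)  # a batch of proposal samples

exact = softcvi_loss(theta, mu=0.0, log_sigma=0.0)  # q matches the posterior
off = softcvi_loss(theta, mu=1.5, log_sigma=0.0)    # mis-specified q
print(exact < off)
```

Because softmax is invariant to additive constants, the classifier's probabilities coincide exactly with the soft labels whenever q matches the (unnormalized) posterior, so the loss attains its minimum there; this is the regime in which the paper's objectives also have zero-variance gradients.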
Problem

Research questions and friction points this paper is trying to address.

Estimating unnormalized posterior distributions in Bayesian inference
Addressing challenges in variational inference for complex geometries
Developing stable, mass-covering objectives without explicit sample labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive variational inference with self-generated soft labels
Classifier parameterized in terms of the variational distribution
Zero-variance gradients when the variational approximation is exact