🤖 AI Summary
This work addresses the bias in evidence lower bound (ELBO) optimization arising from non-integrable densities in semi-implicit variational inference, as well as the high computational cost of existing score-matching approaches that require nested optimization. The authors propose Kernel Semi-Implicit Variational Inference (KSIVI), which introduces kernel methods into this framework for the first time. By leveraging an explicit solution in a reproducing kernel Hilbert space, KSIVI eliminates the need for inner-loop optimization and reformulates the objective as a kernel Stein discrepancy (KSD), enabling efficient stochastic gradient optimization. The method avoids complex min-max formulations, supports multi-layer hierarchical extensions to enhance expressiveness, and provides theoretical guarantees, including a variance bound on gradient estimates and a statistical generalization error bound of order Õ(1/√n). Experiments on both synthetic and real-world Bayesian inference tasks demonstrate its effectiveness and scalability.
📝 Abstract
Semi-implicit variational inference (SIVI) enhances the expressiveness of variational families through hierarchical semi-implicit distributions, but the intractability of their densities makes standard ELBO-based optimization biased. Recent score-matching approaches to SIVI (SIVI-SM) address this issue via a minimax formulation, at the expense of an additional lower-level optimization problem. In this paper, we propose kernel semi-implicit variational inference (KSIVI), a principled and tractable alternative that eliminates the lower-level optimization by leveraging kernel methods. We show that when optimizing over a reproducing kernel Hilbert space, the lower-level problem admits an explicit solution, reducing the objective to the kernel Stein discrepancy (KSD). Exploiting the hierarchical structure of semi-implicit distributions, the resulting KSD objective can be efficiently optimized using stochastic gradient methods. We establish optimization guarantees via variance bounds on Monte Carlo gradient estimators and derive statistical generalization bounds of order $\tilde{\mathcal{O}}(1/\sqrt{n})$. We further introduce a multi-layer hierarchical extension that improves expressiveness while preserving tractability. Empirical results on synthetic and real-world Bayesian inference tasks demonstrate the effectiveness of KSIVI.
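To make the KSD objective concrete, the sketch below estimates the squared kernel Stein discrepancy between a set of samples and a target distribution whose score function ∇ log p is available, using a V-statistic over the standard Stein kernel. This is an illustrative reconstruction under common assumptions (an RBF kernel with a fixed bandwidth `h`, a standard Gaussian target in the usage note), not the authors' implementation; in KSIVI the samples would come from the semi-implicit variational family and this quantity would be minimized by stochastic gradient descent.

```python
import numpy as np

def ksd_vstat(x, score_fn, h=1.0):
    """V-statistic estimate of the squared kernel Stein discrepancy
    between the empirical distribution of samples x (shape (n, d)) and
    a target p whose score function score_fn(x) = grad log p(x) is known.

    Uses the RBF kernel k(x, y) = exp(-||x - y||^2 / (2 h^2)); the
    bandwidth h is a free hyperparameter here (a heuristic choice such
    as the median pairwise distance is common in practice).
    """
    n, d = x.shape
    s = score_fn(x)                           # (n, d) target scores at samples
    diff = x[:, None, :] - x[None, :, :]      # (n, n, d) pairwise x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)           # (n, n) squared distances
    k = np.exp(-sq / (2 * h ** 2))            # RBF kernel matrix

    # Stein kernel u_p(x_i, x_j), assembled term by term:
    t1 = (s @ s.T) * k                                      # s_i^T k_ij s_j
    t2 = np.einsum('id,ijd->ij', s, diff) / h ** 2 * k      # s_i^T grad_y k
    t3 = -np.einsum('jd,ijd->ij', s, diff) / h ** 2 * k     # s_j^T grad_x k
    t4 = (d / h ** 2 - sq / h ** 4) * k                     # tr(grad_x grad_y k)

    return np.mean(t1 + t2 + t3 + t4)        # average over all (i, j) pairs
```

As a quick sanity check, samples drawn from a standard Gaussian target (score `-x`) yield a small KSD value, while samples shifted away from the target yield a much larger one, which is the discrepancy KSIVI drives toward zero during training.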