🤖 AI Summary
Particle-based variational inference (ParVI) suffers from limited exploration capability, low sampling efficiency, and the high computational cost of repeated runs. To address these issues, we propose Semi-Implicit Functional Gradient Flow (SIFG): a method that approximates the target distribution with an ensemble of particles perturbed by Gaussian noise and estimates the gradient flow direction via denoising score matching. SIFG is the first to embed Gaussian perturbation into the functional gradient flow framework, thereby enhancing the smoothness of the approximating family and strengthening its theoretical convergence guarantees. We further introduce an adaptive noise scheduling mechanism to dynamically balance exploration breadth against approximation accuracy. Parameterized by neural networks, SIFG generates high-quality, diverse samples in a single run without reinitialization. Experiments on synthetic and real-world datasets demonstrate that SIFG significantly improves distribution approximation accuracy and sampling efficiency while accelerating convergence, validating its effectiveness and practicality.
📝 Abstract
Particle-based variational inference methods (ParVIs) use nonparametric variational families represented by particles to approximate the target distribution according to the kernelized Wasserstein gradient flow for the Kullback-Leibler (KL) divergence. Although functional gradient flows have been introduced to expand the kernel space for better flexibility, the deterministic updating mechanism may limit exploration and require expensive repetitive runs for new samples. In this paper, we propose Semi-Implicit Functional Gradient Flow (SIFG), a functional gradient ParVI method that uses particles perturbed with Gaussian noise as the approximation family. We show that the corresponding functional gradient flow, which can be estimated via denoising score matching with neural networks, exhibits strong theoretical convergence guarantees owing to the higher-order smoothness that Gaussian perturbation brings to the approximation family. In addition, we present an adaptive version of our method that automatically selects the appropriate noise magnitude during sampling, striking a good balance between exploration efficiency and approximation accuracy. Extensive experiments on both simulated and real-world datasets demonstrate the effectiveness and efficiency of the proposed framework.
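To make the core idea concrete, here is a minimal toy sketch of a SIFG-style particle update, not the paper's algorithm: particles are perturbed with Gaussian noise of scale `eps`, and the flow direction at each perturbed point is the target score minus the score of the Gaussian-smoothed particle distribution. The paper estimates the latter with a neural network trained by denoising score matching; for illustration we substitute the closed-form score of the Gaussian-smoothed empirical measure, and all names (`grad_log_p`, `smoothed_score`, the step sizes) are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: 2-D Gaussian N(mu, I), so grad log p(x) = -(x - mu).
mu = np.array([2.0, -1.0])

def grad_log_p(x):
    return -(x - mu)

def smoothed_score(y, particles, eps):
    """Score of the Gaussian-smoothed empirical measure
    q_eps = (1/N) * sum_i N(x_i, eps^2 I), evaluated at rows of y:
    grad log q_eps(y) = sum_i w_i(y) * (x_i - y) / eps^2,
    where w_i(y) are softmax weights over squared distances."""
    diff = particles[None, :, :] - y[:, None, :]        # (M, N, d)
    logits = -0.5 * np.sum(diff**2, axis=-1) / eps**2   # (M, N)
    logits -= logits.max(axis=1, keepdims=True)         # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum('mn,mnd->md', w, diff) / eps**2

# Particles start far from the target.
x = rng.normal(size=(500, 2)) * 0.5 - 3.0
eps, step = 0.3, 0.1
for _ in range(500):
    z = rng.normal(size=x.shape)
    y = x + eps * z                                  # Gaussian-perturbed particles
    v = grad_log_p(y) - smoothed_score(y, x, eps)    # functional gradient flow direction
    x = x + step * v                                 # move the underlying particles

print(x.mean(axis=0))  # drifts toward the target mean [2, -1]
```

In the actual method a network `s_theta` replaces `smoothed_score`, which is what allows the approach to scale beyond low dimensions, and the adaptive variant tunes `eps` during sampling to trade exploration against approximation accuracy.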