🤖 AI Summary
Posterior inference for deep Gaussian processes (DGPs) faces two problems: mean-field Gaussian approximations limit expressiveness, and stochastic approximation is computationally expensive. This paper proposes Neural Operator Variational Inference (NOVI), based on the Regularized Stein Discrepancy (RSD), to address both. NOVI uses a neural generator as a sampler and minimizes the RSD between the generated distribution and the true posterior in $\mathcal{L}_2$ space. This is presented as the first integration of neural operators into DGP variational inference. The resulting minimax problem is solved with Monte Carlo estimation and subsampled stochastic optimization. The bias introduced by the method can be controlled by a constant multiple of the Fisher divergence, which gives robust error control without the restrictive mean-field assumption. Experiments on datasets ranging from hundreds to millions of samples show faster convergence, and on CIFAR-10 classification NOVI reaches 93.56% accuracy, outperforming state-of-the-art GP-based methods.
📝 Abstract
Deep Gaussian process (DGP) models offer a powerful nonparametric approach to Bayesian inference, but exact inference is typically intractable, which motivates the use of various approximations. However, existing approaches such as mean-field Gaussian assumptions limit the expressiveness and efficacy of DGP models, while stochastic approximation can be computationally expensive. To tackle these challenges, we introduce neural operator variational inference (NOVI) for DGPs. NOVI uses a neural generator to obtain a sampler and minimizes the regularized Stein discrepancy (RSD) between the generated distribution and the true posterior in $\mathcal{L}_2$ space. We solve the minimax problem using Monte Carlo estimation and subsampling stochastic optimization techniques, and we demonstrate that the bias introduced by our method can be controlled by multiplying the Fisher divergence by a constant, which leads to robust error control and ensures the stability and precision of the algorithm. Our experiments on datasets ranging from hundreds to millions of samples demonstrate the effectiveness and faster convergence rate of the proposed method. We achieve a classification accuracy of 93.56% on the CIFAR-10 dataset, outperforming state-of-the-art (SOTA) Gaussian process (GP) methods. We are optimistic that NOVI can enhance the performance of deep Bayesian nonparametric models and could have significant implications for various practical applications.
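The minimax structure described in the abstract can be illustrated on a one-dimensional toy problem. The sketch below is not the paper's implementation. It assumes the commonly used $\mathcal{L}_2$-regularized Stein objective $\sup_f \mathbb{E}_q[s_p(x) f(x) + f'(x) - \tfrac{\lambda}{2} f(x)^2]$, where $s_p = \nabla \log p$ is the target score. Under that form the optimal critic is $f^* = (s_p - s_q)/\lambda$ and the optimum equals the Fisher divergence divided by $2\lambda$. The paper's exact definition may differ. The target $N(\mu, 1)$, the shift generator, the constant critic, and all names (`rsd_estimate`, `theta`, `lam`) are hypothetical choices made so that every step has a closed form. No DGP or neural network is involved.

```python
import numpy as np

def rsd_estimate(x, score_p, critic, critic_div, lam):
    """Monte Carlo estimate of E_q[s_p(x) f(x) + f'(x) - (lam/2) f(x)^2] for 1-D samples x."""
    f = critic(x)
    return np.mean(score_p(x) * f + critic_div(x) - 0.5 * lam * f**2)

rng = np.random.default_rng(0)
mu, lam = 1.5, 2.0               # target mean and regularization weight (illustrative values)
score_p = lambda x: -(x - mu)    # score of the assumed target N(mu, 1)

# Toy generator: x = theta + eps with eps ~ N(0, 1), so q = N(theta, 1).
theta = 0.0
x = theta + rng.standard_normal(200_000)

# Inner maximization over constant critics f(x) = c has the closed form
# c* = E_q[s_p(x)] / lam. Here s_p - s_q is constant, so this class contains
# the optimal critic (s_p - s_q) / lam.
c_star = np.mean(score_p(x)) / lam
rsd = rsd_estimate(
    x,
    score_p,
    critic=lambda x: np.full_like(x, c_star),
    critic_div=lambda x: np.zeros_like(x),   # derivative of a constant critic is zero
    lam=lam,
)
fisher = (mu - theta) ** 2       # E_q[(s_p - s_q)^2] for two unit-variance Gaussians

# Outer minimization: stochastic gradient descent on theta, refitting the critic
# on a fresh minibatch of generator samples at every step.
for _ in range(200):
    xb = theta + rng.standard_normal(10_000)
    c = np.mean(score_p(xb)) / lam           # inner maximization (closed form)
    grad_theta = -c                          # d/dtheta of E[s_p(theta + eps) * c] with c held fixed
    theta -= 0.5 * grad_theta

print(f"RSD estimate: {rsd:.4f}, Fisher / (2 * lam): {fisher / (2 * lam):.4f}")
print(f"theta after training: {theta:.3f} (target mean {mu})")
```

In this toy setting the estimated discrepancy tracks the Fisher divergence scaled by the constant $1/(2\lambda)$, and the generator parameter moves toward the target mean. A realistic version would replace the constant critic with a neural network trained by gradient ascent and the shift generator with a neural sampler.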