Vecchia Gaussian Process Ensembles on Internal Representations of Deep Neural Networks

πŸ“… 2023-05-26
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 1
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Deep neural networks (DNNs) excel at regression but lack reliable uncertainty quantification (UQ), whereas standard Gaussian processes (GPs) provide natural UQ yet scale poorly. Method: The paper proposes Deep Vecchia Ensembles, which apply the Vecchia approximation in the latent feature space of a pre-trained DNN to construct GP ensembles, without modifying or retraining the base network. This mitigates UQ failures caused by feature collapse while preserving single-pass forward inference and enabling plug-and-play uncertainty estimation. Results: On multiple benchmark regression datasets, the method achieves superior predictive accuracy and uncertainty calibration compared to state-of-the-art deterministic UQ methods, at a computational cost significantly lower than full GP inference. The core innovation is decoupling scalable GP approximation from deep representation learning, yielding an efficient, robust, and retraining-free deep UQ framework.
πŸ“ Abstract
For regression tasks, standard Gaussian processes (GPs) provide natural uncertainty quantification (UQ), while deep neural networks (DNNs) excel at representation learning. Deterministic UQ methods for neural networks have successfully combined the two and require only a single pass through the neural network. However, current methods necessitate changes to network training to address feature collapse, where unique inputs map to identical feature vectors. We propose an alternative solution, the deep Vecchia ensemble (DVE), which allows deterministic UQ to work in the presence of feature collapse, negating the need for network retraining. DVE comprises an ensemble of GPs built on hidden-layer outputs of a DNN, achieving scalability via Vecchia approximations that leverage nearest-neighbor conditional independence. DVE is compatible with pretrained networks and incurs low computational overhead. We demonstrate DVE's utility on several datasets and carry out experiments to understand the inner workings of the proposed method.
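The key mechanism in the abstract, a GP whose predictions condition only on a few nearest neighbors in the network's feature space, can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the RBF kernel, unit prior variance, fixed noise level, and the helper names `rbf_kernel` and `vecchia_predict` are all assumptions made for the sketch.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of A and B (assumed kernel choice)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def vecchia_predict(feats_train, y_train, feats_test, m=5, noise=1e-2):
    """GP prediction where each test point conditions only on its m nearest
    training points in (DNN) feature space -- Vecchia-style sparse conditioning."""
    m = min(m, len(feats_train))
    means, variances = [], []
    for f in feats_test:
        # Indices of the m nearest training features to this test feature.
        nn = np.argsort(((feats_train - f) ** 2).sum(-1))[:m]
        K = rbf_kernel(feats_train[nn], feats_train[nn]) + noise * np.eye(m)
        k = rbf_kernel(f[None, :], feats_train[nn]).ravel()
        alpha = np.linalg.solve(K, y_train[nn])
        means.append(k @ alpha)
        # Predictive variance under a unit prior variance plus noise.
        variances.append(1.0 + noise - k @ np.linalg.solve(K, k))
    return np.array(means), np.array(variances)
```

Because each test point solves only an m-by-m system instead of an n-by-n one, cost stays linear in the number of test points, which is the scalability argument behind the Vecchia approximation.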
Problem

Research questions and friction points this paper is trying to address.

Combines Gaussian processes with deep neural networks for regression tasks.
Addresses feature collapse without requiring network retraining.
Provides scalable uncertainty quantification using Vecchia approximations.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble of GPs on DNN hidden layers
Vecchia approximations for scalability
Compatible with pretrained networks
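Since one GP is built per hidden layer, the per-layer predictive distributions must be merged into a single prediction. The paper does not spell out the rule in this summary, so the precision-weighted (generalized product-of-experts style) combination below is an assumed illustration:

```python
import numpy as np

def combine_predictions(means, variances):
    """Precision-weighted combination of per-layer Gaussian predictions.
    Illustrative assumption: the paper's actual aggregation rule may differ."""
    means = np.asarray(means)          # shape: (n_layers, n_test)
    variances = np.asarray(variances)  # shape: (n_layers, n_test)
    precision = 1.0 / variances
    var = 1.0 / precision.sum(axis=0)          # combined variance
    mean = var * (precision * means).sum(axis=0)  # precision-weighted mean
    return mean, var
```

Under this rule, layers whose GPs are more confident (lower variance) contribute more to the ensemble mean, and the combined variance is never larger than the smallest member variance.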
πŸ”Ž Similar Papers
No similar papers found.
Felix Jimenez
Amazon
Machine Learning · Bayesian Inference · Uncertainty Quantification
M. Katzfuss
Department of Statistics, Texas A&M University