🤖 AI Summary
To address the high memory and computational costs of training Bayesian neural networks (BNNs) on large-scale datasets, this paper proposes Variational Bayesian Pseudo-Coresets (VBPC). Unlike prior Bayesian pseudo-coreset (BPC) methods, which incur heavy memory costs during training and often yield sub-optimal posteriors, VBPC integrates variational inference directly into pseudo-coreset construction: the pseudo-samples and the variational posterior they induce are optimized jointly by gradient descent, giving an efficient, learnable compression of the original data distribution. This substantially reduces the memory required for posterior approximation: on multiple benchmark datasets, VBPC achieves over 30% lower GPU memory consumption than state-of-the-art BPC methods while also improving predictive accuracy and uncertainty calibration.
📝 Abstract
The success of deep learning requires large datasets and extensive training, which can create significant computational challenges. To address these challenges, pseudo-coresets, small learnable datasets that mimic the full dataset, have been proposed. Bayesian neural networks, which provide predictive uncertainty and a probabilistic interpretation for deep neural networks, also struggle with large-scale datasets due to their high-dimensional parameter space. Prior works on Bayesian Pseudo-Coresets (BPC) attempt to reduce the cost of computing the weight posterior distribution by replacing the full dataset with a small number of pseudo-samples, but they suffer from memory inefficiency during BPC training and sub-optimal results. To overcome these limitations, we propose Variational Bayesian Pseudo-Coreset (VBPC), a novel approach that uses variational inference to efficiently approximate the posterior distribution, reducing memory usage and computational costs while improving performance across benchmark datasets.
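The abstract describes two ingredients: a small set of learnable pseudo-samples, and an approximate posterior induced by them that should match the full-data posterior. The sketch below illustrates that core idea in the simplest verifiable setting, 1-D Bayesian linear regression with a conjugate Gaussian posterior: a handful of learnable pseudo-points are optimized by gradient descent so that the posterior they induce matches the full-data posterior in its natural parameters. This is a toy illustration of the pseudo-coreset principle, not the paper's VBPC algorithm; all variable names, sizes, and hyperparameters here are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Full dataset from a toy 1-D Bayesian linear model y = w*x + noise ---
sigma2 = 0.25          # known noise variance (illustrative value)
true_w = 2.0
X = rng.uniform(-1.0, 1.0, size=200)
Y = true_w * X + rng.normal(0.0, np.sqrt(sigma2), size=200)

def natural_params(x, y):
    """Posterior over w under prior N(0, 1), in natural-parameter form:
    precision p = 1 + sum(x^2)/sigma2 and linear term h = sum(x*y)/sigma2,
    so the posterior is N(h/p, 1/p)."""
    return 1.0 + np.sum(x**2) / sigma2, np.sum(x * y) / sigma2

p_full, h_full = natural_params(X, Y)

# --- Learnable pseudo-coreset: m points whose induced posterior should
#     match the full-data posterior ---
m = 3
xp = rng.normal(0.0, 1.0, size=m)
yp = rng.normal(0.0, 1.0, size=m)

lr = 1.0
for step in range(3000):
    p, h = natural_params(xp, yp)
    rp = (p - p_full) / p_full      # relative residuals keep both terms
    rh = (h - h_full) / h_full      # on a comparable scale
    # Analytic gradients of L = rp^2 + rh^2 w.r.t. the pseudo-data
    gx = (2 * rp / p_full) * (2 * xp / sigma2) + (2 * rh / h_full) * (yp / sigma2)
    gy = (2 * rh / h_full) * (xp / sigma2)
    xp -= lr * gx
    yp -= lr * gy

p, h = natural_params(xp, yp)
print(f"posterior mean: full={h_full/p_full:.4f}  pseudo={h/p:.4f}")
print(f"posterior var : full={1/p_full:.5f}  pseudo={1/p:.5f}")
```

After training, three pseudo-points reproduce the posterior that 200 real points induce. The actual paper operates on neural-network weight posteriors, where no closed form exists, which is exactly why it brings in variational inference; the conjugate model here stands in for that variational approximation.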