🤖 AI Summary
Bayesian learning with modern machine learning models is computationally challenging at scale because it requires approximating a high-dimensional posterior distribution. This paper addresses that challenge in three ways. First, it presents a tempered framing of stochastic gradient MCMC that transitions seamlessly into optimization and reveals a minor modification to deep ensembles that makes them asymptotically unbiased for the Bayesian posterior. Second, it introduces Posteriors, an easily extensible open-source PyTorch library of general-purpose implementations under a unified optimization-and-sampling interface, making Bayesian learning accessible and scalable to large data and parameter regimes. Third, it demonstrates and compares the utility of Bayesian approximations through experiments, including an investigation into the cold posterior effect and applications with large language models (LLMs), with attention to uncertainty calibration and out-of-distribution robustness.
📝 Abstract
Although theoretically compelling, Bayesian learning with modern machine learning models is computationally challenging since it requires approximating a high-dimensional posterior distribution. In this work, we (i) introduce posteriors, an easily extensible PyTorch library hosting general-purpose implementations that make Bayesian learning accessible and scalable to large data and parameter regimes; (ii) present a tempered framing of stochastic gradient Markov chain Monte Carlo, as implemented in posteriors, that transitions seamlessly into optimization and unveils a minor modification to deep ensembles to ensure they are asymptotically unbiased for the Bayesian posterior; and (iii) demonstrate and compare the utility of Bayesian approximations through experiments, including an investigation into the cold posterior effect and applications with large language models.
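To make the tempered framing concrete, here is a minimal sketch of tempered stochastic gradient Langevin dynamics on a scalar parameter. This is not the posteriors API; the function name and setup are illustrative. The key point from the abstract is the seamless transition: at temperature 1 the update is a standard SGLD sampling step, while at temperature 0 the noise term vanishes and the update reduces to plain gradient-ascent optimization.

```python
# Illustrative sketch (not the posteriors API): tempered stochastic gradient
# Langevin dynamics. The temperature interpolates between posterior sampling
# (temperature = 1) and pure gradient-ascent optimization (temperature = 0).
import math
import random

def tempered_sgld_step(theta, grad_log_post, step_size, temperature):
    """One tempered SGLD update:

    theta' = theta + step_size * grad_log_post(theta)
             + sqrt(2 * step_size * temperature) * standard_normal_noise
    """
    noise = random.gauss(0.0, 1.0)
    return (theta
            + step_size * grad_log_post(theta)
            + math.sqrt(2.0 * step_size * temperature) * noise)

# Toy posterior: standard normal, log p(theta) = -theta**2 / 2,
# so grad log p(theta) = -theta and the posterior mode is at 0.
grad_log_post = lambda t: -t

random.seed(0)
theta = 5.0
for _ in range(2000):
    theta = tempered_sgld_step(theta, grad_log_post,
                               step_size=0.01, temperature=0.0)
# At temperature 0 the noise term is zero, so the loop is deterministic
# gradient ascent on the log-posterior and theta converges to the mode at 0.
```

Setting the temperature strictly between 0 and 1 gives the cold-posterior regime investigated in the experiments, where the chain concentrates more tightly around the mode than the exact Bayesian posterior.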