Scalable Generalized Bayesian Online Neural Network Training for Sequential Decision Making

📅 2025-06-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address scalability challenges in online learning of neural network parameters and generalized Bayesian inference for sequential decision-making tasks, this paper proposes a real-time, full-parameter update framework that eliminates reliance on replay buffers or offline retraining. Methodologically, it combines frequentist and Bayesian filtering, employing a block-diagonal covariance approximation and a hybrid error-covariance update scheme (low-rank updates for hidden layers and full-rank updates for the output layer) to construct well-calibrated posterior predictive distributions. Crucially, although this hybrid scheme characterizes an improper posterior, the paper shows that the resulting posterior predictive distribution is still well-defined. Experiments on (non-stationary) contextual bandits and Bayesian optimization demonstrate that the method achieves a competitive tradeoff between inference speed and decision accuracy.

📝 Abstract
We introduce scalable algorithms for online learning and generalized Bayesian inference of neural network parameters, designed for sequential decision-making tasks. Our methods combine the strengths of frequentist and Bayesian filtering, which include fast low-rank updates via a block-diagonal approximation of the parameter error covariance, and a well-defined posterior predictive distribution that we use for decision making. More precisely, our main method updates a low-rank error covariance for the hidden-layer parameters, and a full-rank error covariance for the final-layer parameters. Although this characterizes an improper posterior, we show that the resulting posterior predictive distribution is well-defined. Our methods update all network parameters online, with no need for replay buffers or offline retraining. We show, empirically, that our methods achieve a competitive tradeoff between speed and accuracy on (non-stationary) contextual bandit problems and Bayesian optimization problems.
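The covariance structure the abstract describes can be sketched in a few lines of numpy: a block-diagonal parameter error covariance whose hidden-layer block is kept as a low-rank factor and whose last-layer block is full-rank, plus the linearized predictive variance this induces. All dimensions, variable names, and values below are invented for illustration and are not the paper's actual choices.

```python
import numpy as np

# Hedged sketch: block-diagonal error covariance with a low-rank
# hidden-layer block and a full-rank last-layer block. Dimensions and
# names are hypothetical, not taken from the paper.
rng = np.random.default_rng(0)
d_h, r, d_out = 50, 5, 10            # hidden params, rank, last-layer params

W = 0.1 * rng.normal(size=(d_h, r))  # hidden block ~ W @ W.T (rank r)
Sigma_L = 0.1 * np.eye(d_out)        # full-rank last-layer block
obs_var = 0.01                       # observation-noise variance

# Linearize the network output in the parameters at some input:
J_h = rng.normal(size=(1, d_h))      # Jacobian w.r.t. hidden params
J_L = rng.normal(size=(1, d_out))    # Jacobian w.r.t. last-layer params

# Predictive variance J P J^T + R with block-diagonal P. The low-rank
# block contributes (J_h W)(J_h W)^T without ever forming W @ W.T,
# which is what makes the hidden-layer updates cheap.
var = (J_h @ W) @ (J_h @ W).T + J_L @ Sigma_L @ J_L.T + obs_var
```

Because the blocks are independent, the predictive variance decomposes into per-block terms, which is one way a posterior predictive can remain well-defined even when the joint posterior over all parameters is improper.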
Problem

Research questions and friction points this paper is trying to address.

Scalable online training for neural networks in sequential decision making
Combining frequentist and Bayesian methods for efficient parameter updates
Achieving speed-accuracy balance in non-stationary decision tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines frequentist and Bayesian filtering techniques
Uses low-rank and full-rank error covariance updates
Online updates without replay buffers or retraining
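The "online updates without replay buffers" point amounts to a recursive filtering step: each observation is folded into the running mean and covariance and then discarded. Below is a hedged sketch of a standard Kalman-style update for the full-rank last-layer block; the function name, shapes, and noise model are illustrative assumptions, and the paper's actual update rule may differ.

```python
import numpy as np

# Hedged sketch: one Kalman-style online update for the full-rank
# last-layer block, under a linearized scalar observation y ~ J @ mu.
# Names and shapes are hypothetical, not the paper's API.
def last_layer_update(mu, Sigma, J, y, obs_var):
    """One O(d^2) update; no replay buffer is kept."""
    S = J @ Sigma @ J.T + obs_var         # innovation variance, (1, 1)
    K = (Sigma @ J.T) / S                 # Kalman gain, (d, 1)
    mu = mu + (K * (y - J @ mu)).ravel()  # posterior mean
    Sigma = Sigma - K @ (J @ Sigma)       # posterior covariance
    return mu, Sigma

rng = np.random.default_rng(1)
d = 4
Sigma0 = np.eye(d)                        # prior last-layer covariance
J = rng.normal(size=(1, d))               # linearization at current input
mu2, Sigma2 = last_layer_update(np.zeros(d), Sigma0, J, y=1.0, obs_var=0.1)
```

Each step shrinks the posterior covariance along the observed direction, so uncertainty contracts only where data has arrived, which is the behavior a bandit or Bayesian-optimization loop exploits.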