🤖 AI Summary
This work addresses the computational intractability of posterior inference in Bayesian neural networks, which hinders scalability. Traditional sequential Monte Carlo (SMC) methods rely on full-batch data, incurring prohibitive computational costs. To overcome this limitation, the authors propose a data annealing strategy that incrementally incorporates mini-batches within the SMC framework, enabling progressive updates to the likelihood and gradient estimates. This integration of mini-batch processing with SMC sampling achieves substantial gains in computational efficiency while preserving sampling accuracy. Empirical evaluations on standard image classification benchmarks demonstrate up to a six-fold speedup compared to conventional full-batch SMC, with negligible degradation in model accuracy.
📝 Abstract
Bayesian inference allows us to define a posterior distribution over the weights of a generic neural network (NN). Exact posteriors are usually intractable, in which case approximations can be employed. One such approximation - variational inference - is computationally efficient when using mini-batch stochastic gradient descent, as subsets of the data are used for likelihood and gradient evaluations, though the approach relies on the selection of a variational distribution which sufficiently matches the form of the posterior. Particle-based methods such as Markov chain Monte Carlo and Sequential Monte Carlo (SMC) do not assume a parametric family for the posterior but typically incur higher computational cost. These sampling methods typically use the full batch of data for likelihood and gradient evaluations, which contributes to this computational expense. We explore several methods of gradually introducing more mini-batches of data (data annealing) into the likelihood and gradient evaluations of an SMC sampler. We find that we can achieve up to $6\times$ faster training with minimal loss in accuracy on benchmark image classification problems using NNs.
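The data annealing idea described in the abstract can be illustrated with a minimal sketch. The code below is not the paper's implementation: the schedule, the toy Gaussian model, and all function names (`anneal_schedule`, `annealed_log_target`) are illustrative assumptions. It shows the core mechanic: early SMC steps evaluate the log-likelihood on only a few mini-batches (cheap, diffuse target), and later steps progressively incorporate the full dataset.

```python
# Hypothetical sketch of data annealing for an SMC sampler.
# Toy model: unknown Gaussian mean with unit observation noise.
import math
import random

random.seed(0)

# Toy data: noisy observations of an unknown mean (true value 2.0).
data = [random.gauss(2.0, 1.0) for _ in range(600)]
batch_size = 100
batches = [data[i:i + batch_size] for i in range(0, len(data), batch_size)]

def anneal_schedule(step, n_steps, n_batches):
    """Linearly introduce mini-batches: step 0 uses one batch,
    the final step uses all of them."""
    return max(1, math.ceil((step + 1) / n_steps * n_batches))

def log_likelihood(theta, observed):
    """Gaussian log-likelihood with unit variance (additive constants dropped)."""
    return -0.5 * sum((x - theta) ** 2 for x in observed)

def annealed_log_target(theta, step, n_steps):
    """Unnormalised log-target using only the mini-batches
    available at this annealing step."""
    k = anneal_schedule(step, n_steps, len(batches))
    observed = [x for b in batches[:k] for x in b]
    return log_likelihood(theta, observed)

n_steps = 6
# Per-step cost grows with the number of mini-batches included.
costs = [anneal_schedule(t, n_steps, len(batches)) for t in range(n_steps)]
print(costs)  # → [1, 2, 3, 4, 5, 6]
```

In a full SMC sampler, `annealed_log_target` would replace the full-batch log-likelihood when reweighting and mutating particles at each step; the savings come from the early steps touching only a fraction of the data.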