Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural Operators

📅 2025-07-19
🤖 AI Summary
In Bayesian neural networks, Hamiltonian Monte Carlo (HMC) posterior sampling suffers from prohibitive computational cost due to high-dimensional non-convexity, while variational inference (VI) and other approximations compromise the reliability of uncertainty quantification. To address this trade-off, we propose a variational-guided hybrid inference framework: first, jointly leveraging VI and parameter sensitivity analysis to identify a low-dimensional subset of parameters that dominantly influence predictive uncertainty; then, performing full-batch HMC exclusively within this compressed subspace. This work introduces the first principled use of VI to guide parameter-space dimensionality reduction for HMC, achieving a favorable balance between accuracy and efficiency. Experiments demonstrate substantial improvements in uncertainty quantification accuracy on large-scale neural networks and neural operators with tens of thousands to hundreds of thousands of parameters. Furthermore, the method successfully constructs a high-fidelity physics-informed surrogate model for hypersonic flow wall pressure prediction.

📝 Abstract
Hamiltonian Monte Carlo (HMC) is a powerful and accurate method to sample from the posterior distribution in Bayesian inference. However, HMC techniques are computationally demanding for Bayesian neural networks due to the high dimensionality of the network's parameter space and the non-convexity of its posterior distribution. Therefore, various approximation techniques, such as variational inference (VI) or stochastic gradient MCMC, are often employed to infer the posterior distribution of the network parameters. Such approximations introduce inaccuracies in the inferred distributions, resulting in unreliable uncertainty estimates. In this work, we propose a hybrid approach that combines inexpensive VI and accurate HMC methods to efficiently and accurately quantify uncertainties in neural networks and neural operators. The proposed approach leverages an initial VI training on the full network. We examine the influence of individual parameters on the prediction uncertainty, which shows that a large proportion of the parameters do not contribute substantially to uncertainty in the network predictions. This information is then used to significantly reduce the dimension of the parameter space, and HMC is performed only for the subset of network parameters that strongly influence prediction uncertainties. This yields a framework for accelerating full-batch HMC for posterior inference in neural networks. We demonstrate the efficiency and accuracy of the proposed framework on deep neural networks and operator networks, showing that inference can be performed for large networks with tens to hundreds of thousands of parameters. We show that this method can effectively learn surrogates for complex physical systems by modeling the operator that maps from upstream conditions to wall-pressure data on a cone in hypersonic flow.
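The two-stage recipe described in the abstract can be illustrated with a minimal, self-contained sketch. This is not the authors' code: the mean-field VI posterior is a random stand-in, the sensitivity screen is approximated here by the per-parameter VI standard deviation, and the target is a toy Gaussian log-posterior so the example runs on its own. All names (`vi_mean`, `active`, `hmc`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained mean-field VI posterior over D network parameters:
# per-parameter means and standard deviations (hypothetical values).
D = 500
vi_mean = rng.normal(0.0, 1.0, D)
vi_std = np.abs(rng.normal(0.0, 0.5, D)) + 1e-3

# Step 1: rank parameters by their VI posterior spread as a cheap proxy for
# their influence on predictive uncertainty, and keep only the top-k.
k = 20
active = np.argsort(vi_std)[-k:]  # indices of the k most "uncertain" parameters

# Step 2: run HMC only in the k-dimensional active subspace, with all other
# parameters frozen at their VI means. Toy isotropic Gaussian log-posterior
# keeps the sketch self-contained.
def log_post(theta_active):
    theta = vi_mean.copy()
    theta[active] = theta_active       # frozen parameters stay at VI means
    return -0.5 * np.sum(theta ** 2)

def grad_log_post(theta_active):
    return -theta_active               # gradient w.r.t. active coordinates only

def hmc(theta0, n_samples=200, eps=0.1, n_leap=20):
    samples, theta = [], theta0.copy()
    for _ in range(n_samples):
        p = rng.normal(size=theta.shape)
        theta_new, p_new = theta.copy(), p.copy()
        # leapfrog integration of Hamiltonian dynamics
        p_new += 0.5 * eps * grad_log_post(theta_new)
        for _ in range(n_leap - 1):
            theta_new += eps * p_new
            p_new += eps * grad_log_post(theta_new)
        theta_new += eps * p_new
        p_new += 0.5 * eps * grad_log_post(theta_new)
        # Metropolis accept/reject on the joint (position, momentum) energy
        log_acc = (log_post(theta_new) - 0.5 * p_new @ p_new) \
                - (log_post(theta) - 0.5 * p @ p)
        if np.log(rng.uniform()) < log_acc:
            theta = theta_new
        samples.append(theta.copy())
    return np.asarray(samples)

samples = hmc(vi_mean[active])
print(samples.shape)  # chain of 200 draws over only the 20 active parameters
```

In the paper's setting the screening step uses VI together with a parameter sensitivity analysis of the predictive uncertainty, and the HMC target is the full-batch network posterior; the structure above only mirrors the subspace idea.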
Problem

Research questions and friction points this paper is trying to address.

Accelerating HMC for Bayesian neural networks
Reducing parameter space dimension for efficiency
Improving uncertainty estimates in neural operators
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid approach combining VI and HMC
Reduces parameter space dimension significantly
Accelerates HMC for large neural networks
Ponkrshnan Thiagarajan
Johns Hopkins University
Uncertainty quantification, Bayesian methods, Machine learning, Computational mechanics

Tamer A. Zaki
Hopkins Extreme Materials Institute, Johns Hopkins University, Baltimore, MD, USA; Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD, USA

Michael D. Shields
Johns Hopkins University
Uncertainty quantification, Computational mechanics, Stochastic processes