AI Summary
This work addresses the high computational cost of Monte Carlo estimation in Bayesian inverse problems, which often arises from large variances in quantities of interest under the posterior distribution. To mitigate this, the authors propose a conditional neural control variate method that learns generalizable control variates from joint samples of parameters and data to effectively reduce variance. The approach leverages a scalable neural architecture grounded in Stein's identity, incorporating hierarchical coupling layers and tractable Jacobian trace computation, enabling generalization across different observational data without retraining. The required posterior score function can be derived from physical models, neural operators, or conditional normalizing flows. Demonstrated on a Darcy flow inverse problem, the method achieves substantial variance reduction even when using learned score approximations in place of analytical scores.
Abstract
Bayesian inference for inverse problems involves computing expectations under posterior distributions -- e.g., posterior means, variances, or predictive quantities -- typically via Monte Carlo (MC) estimation. When the quantity of interest varies significantly under the posterior, accurate estimates demand many samples -- a cost often prohibitive for partial differential equation-constrained problems. To address this challenge, we introduce conditional neural control variates, a modular method that learns amortized control variates from joint model-data samples to reduce the variance of MC estimators. To scale to high-dimensional problems, we leverage Stein's identity to design an architecture based on an ensemble of hierarchical coupling layers with tractable Jacobian trace computation. Training requires: (i) samples from the joint distribution of unknown parameters and observed data; and (ii) the posterior score function, which can be computed from physics-based likelihood evaluations, neural operator surrogates, or learned generative models such as conditional normalizing flows. Once trained, the control variates generalize across observations without retraining. We validate our approach on stylized and partial differential equation-constrained Darcy flow inverse problems, demonstrating substantial variance reduction, even when the analytical score is replaced by a learned surrogate.
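To make the role of Stein's identity concrete, the following is a minimal sketch, not the paper's architecture or training objective. For a smooth function g(x, y), Stein's identity gives E[dg/dx + g * score | y] = 0 under the posterior, so h = dg/dx + g * score can be subtracted from the quantity of interest without changing its posterior mean. Everything specific here is an assumption chosen for illustration: a scalar conjugate Gaussian model (prior x ~ N(0, 1), likelihood y | x ~ N(x, s^2)) so the posterior score is analytical, a constant g(x, y) = a standing in for the neural network, a closed-form least-squares fit of `a` on joint samples, and the hypothetical names `posterior_score`, `stein_control_variate`, and `fit_coefficient`.

```python
import random
import statistics

NOISE_VAR = 0.5  # assumed observation-noise variance s^2 (illustrative value)


def posterior_score(x, y, noise_var=NOISE_VAR):
    """Score d/dx log p(x | y) for prior x ~ N(0, 1), likelihood y | x ~ N(x, noise_var)."""
    return -x + (y - x) / noise_var


def stein_control_variate(x, y, a):
    """Stein control variate h = dg/dx + g * score for the constant choice g(x, y) = a.

    Here dg/dx = 0, so h reduces to a * score, and Stein's identity gives E[h | y] = 0.
    """
    return a * posterior_score(x, y)


def fit_coefficient(f, joint_samples):
    """Least-squares fit of a on joint (x, y) samples, minimizing the variance of f - h."""
    fs = [f(x) for x, _ in joint_samples]
    ss = [posterior_score(x, y) for x, y in joint_samples]
    f_bar = statistics.fmean(fs)
    s_bar = statistics.fmean(ss)
    cov = sum((fi - f_bar) * (si - s_bar) for fi, si in zip(fs, ss))
    var = sum((si - s_bar) ** 2 for si in ss)
    return cov / var


def sample_joint(n, rng, noise_var=NOISE_VAR):
    """Draw n (parameter, observation) pairs from the joint distribution."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, noise_var ** 0.5)
        pairs.append((x, y))
    return pairs


def sample_posterior(y, n, rng, noise_var=NOISE_VAR):
    """Draw n exact posterior samples; the conjugate posterior is Gaussian."""
    mean = y / (1.0 + noise_var)
    std = (noise_var / (1.0 + noise_var)) ** 0.5
    return [rng.gauss(mean, std) for _ in range(n)]


def quantity_of_interest(x):
    """Quantity whose posterior expectation is wanted (here the posterior mean)."""
    return x


rng = random.Random(0)

# Fit once on joint samples (the amortized step).
a_hat = fit_coefficient(quantity_of_interest, sample_joint(2000, rng))

# Reuse the fitted control variate for an observation not seen during fitting.
y_new = 1.3
xs = sample_posterior(y_new, 50, rng)
plain = [quantity_of_interest(x) for x in xs]
corrected = [quantity_of_interest(x) - stein_control_variate(x, y_new, a_hat) for x in xs]

plain_estimate = statistics.fmean(plain)
cv_estimate = statistics.fmean(corrected)
plain_var = statistics.variance(plain)
cv_var = statistics.variance(corrected)
```

In this toy model the ideal coefficient is a = -s^2 / (1 + s^2), for which f - h equals the posterior mean y / (1 + s^2) for every sample and every observation, so the corrected estimator has essentially no variance. A single fit on joint samples therefore serves all observations, which is the amortization property the abstract describes; in realistic problems g would need to be a flexible function of both x and y, and its divergence (the Jacobian trace) would no longer vanish.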