🤖 AI Summary
Under model misspecification, Bayesian model comparison (BMC) with neural surrogate models yields distorted results. To address this, we propose a calibration method based on a self-consistency loss: during simulation-based training, unlabeled real-world data are used to enforce probabilistic consistency of the surrogate's outputs under data perturbations, thereby improving robustness to distributional shift relative to the true data-generating process. The method requires no access to the true likelihood and is compatible with standard BMC estimators such as bridge sampling. Experiments show that when analytic likelihoods are available, the proposed loss substantially improves both the calibration and the ranking reliability of Bayesian evidence estimates; gains are marginal when the likelihood itself is a neural surrogate, indicating that the approach is most useful when exact likelihoods can be evaluated. Our key contribution is the first integration of the self-consistency principle into amortized BMC surrogate training, enhancing trustworthy inference under model misspecification.
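To make the self-consistency idea concrete: by Bayes' rule, the evidence implied by prior, likelihood, and (surrogate) posterior is identical for every parameter value, so its variability across posterior draws on real observations can serve as an unsupervised training signal. The sketch below uses our own notation and one common variance-based formulation; the paper's exact loss may differ.

```latex
% Self-consistency of the model evidence (sketch; notation ours).
% For any parameter value \theta, Bayes' rule implies
\[
  p(y \mid M) \;=\; \frac{p(\theta \mid M)\, p(y \mid \theta, M)}{p(\theta \mid y, M)}
  \qquad \text{for all } \theta .
\]
% With a surrogate posterior q_\phi(\theta \mid y, M), the implied log-evidence
\[
  \log \hat{p}_\phi(y \mid M; \theta)
  \;=\; \log p(\theta \mid M) + \log p(y \mid \theta, M) - \log q_\phi(\theta \mid y, M)
\]
% should not depend on \theta; a self-consistency loss can penalize its spread
% across posterior draws for unlabeled real observations y, e.g.
\[
  \mathcal{L}_{\mathrm{SC}}(y)
  \;=\; \operatorname{Var}_{\theta_k \sim q_\phi(\cdot \mid y, M)}
        \big[ \log \hat{p}_\phi(y \mid M; \theta_k) \big].
\]
```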
📝 Abstract
Amortized Bayesian model comparison (BMC) enables fast probabilistic ranking of models via simulation-based training of neural surrogates. However, the reliability of neural surrogates deteriorates when simulation models are misspecified, which is precisely the setting in which model comparison is most needed. We therefore supplement simulation-based training with a self-consistency (SC) loss on unlabeled real data to improve BMC estimates under empirical distribution shifts. Using a numerical experiment and two case studies with real data, we compare amortized evidence estimates with and without SC against analytic or bridge sampling benchmarks. SC improves calibration under model misspecification when analytic likelihoods are available. However, it offers limited gains with neural surrogate likelihoods, making it most practical for trustworthy BMC when likelihoods are exact.
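For concreteness, here is a minimal PyTorch-style sketch of how a variance-based SC penalty on unlabeled real data could supplement a simulation-based training loss. All names (`sc_loss`, `log_prior`, `log_lik`, `posterior`, `lambda_sc`) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a self-consistency (SC) penalty for amortized BMC.
# The interfaces below are placeholders, not the paper's API.
import torch

def sc_loss(y_real, log_prior, log_lik, posterior, num_draws=8):
    """Variance of the implied log-evidence across surrogate posterior draws.

    y_real    : batch of unlabeled real observations, shape (B, ...)
    log_prior : callable theta -> log p(theta), returning shape (K, B)
    log_lik   : callable (theta, y) -> log p(y | theta), returning shape (K, B)
    posterior : surrogate with .sample(y, K) -> (K, B, D) and .log_prob(theta, y) -> (K, B)
    """
    theta = posterior.sample(y_real, num_draws)            # (K, B, D)
    log_evidence = (
        log_prior(theta)                                   # (K, B)
        + log_lik(theta, y_real)                           # (K, B)
        - posterior.log_prob(theta, y_real)                # (K, B)
    )
    # If the surrogate were exact, log_evidence would not depend on theta;
    # penalize its spread across the K posterior draws.
    return log_evidence.var(dim=0).mean()

# Possible use during training (lambda_sc is a tuning weight):
# total_loss = simulation_based_loss + lambda_sc * sc_loss(y_real, log_prior, log_lik, posterior)
```

Note that `log_lik` may be either an analytic likelihood or itself a neural surrogate; the paper's findings suggest the penalty is most beneficial in the former case.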