Towards Trustworthy Amortized Bayesian Model Comparison

📅 2025-08-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Under model misspecification, Bayesian model comparison (BMC) with neural surrogate models yields distorted results. To address this, we propose a calibration method based on a self-consistency loss: during simulation-based training, unlabeled real-world data are used to enforce probabilistic consistency of the surrogate's outputs, thereby improving robustness to distributional shift in the true data-generating process. The method requires no access to the true likelihood and is compatible with standard evidence estimators such as bridge sampling. Experiments show that when analytic likelihoods are available, the proposed loss markedly improves both the calibration and the ranking reliability of Bayesian evidence estimates; gains are marginal with purely neural likelihood surrogates, indicating the method is most useful when exact likelihoods are at hand. The key contribution is the first integration of the self-consistency principle into amortized BMC surrogate training, supporting more trustworthy inference under model misspecification.
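The self-consistency principle behind the loss can be illustrated on a toy conjugate-Gaussian model (the setup and names below are illustrative, not taken from the paper): Bayes' rule gives log p(y) = log p(y|θ) + log p(θ) − log p(θ|y) for every θ, so log-evidence estimates computed from different posterior draws must all agree. Penalizing their variance across draws yields a self-consistency loss that is zero for a calibrated posterior and positive for a miscalibrated one — a minimal sketch:

```python
import numpy as np

def log_normal(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def self_consistency_loss(log_lik, log_prior, log_post):
    """Variance of the per-draw log-evidence estimates.

    By Bayes' rule, log p(y|theta) + log p(theta) - log p(theta|y)
    equals log p(y) for *every* theta, so the variance across draws
    vanishes exactly when the (surrogate) posterior is self-consistent.
    """
    log_evidence = log_lik + log_prior - log_post
    return np.var(log_evidence)

# Toy conjugate model: theta ~ N(0, 1), y | theta ~ N(theta, 1)
# => exact posterior is theta | y ~ N(y / 2, 1 / 2).
rng = np.random.default_rng(0)
y = 1.3
theta = rng.normal(y / 2, np.sqrt(0.5), size=200)  # posterior draws

log_lik = log_normal(y, theta, 1.0)
log_prior = log_normal(theta, 0.0, 1.0)

# Exact posterior: the loss vanishes up to floating-point error.
loss_exact = self_consistency_loss(
    log_lik, log_prior, log_normal(theta, y / 2, 0.5))

# Miscalibrated "surrogate" posterior (variance too small): positive loss.
loss_bad = self_consistency_loss(
    log_lik, log_prior, log_normal(theta, y / 2, 0.25))

print(loss_exact, loss_bad)
```

In the paper's setting, the analytic posterior is replaced by a neural surrogate and the loss is evaluated on unlabeled real observations alongside the usual simulation-based training objective.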

📝 Abstract
Amortized Bayesian model comparison (BMC) enables fast probabilistic ranking of models via simulation-based training of neural surrogates. However, the reliability of neural surrogates deteriorates when simulation models are misspecified, the very case where model comparison is most needed. Thus, we supplement simulation-based training with a self-consistency (SC) loss on unlabeled real data to improve BMC estimates under empirical distribution shifts. Using a numerical experiment and two case studies with real data, we compare amortized evidence estimates with and without SC against analytic or bridge sampling benchmarks. SC improves calibration under model misspecification when analytic likelihoods are available. However, it offers limited gains with neural surrogate likelihoods, making it most practical for trustworthy BMC when likelihoods are exact.
Problem

Research questions and friction points this paper is trying to address.

Improving reliability of neural surrogates under model misspecification
Addressing deterioration of Bayesian model comparison estimates
Enhancing calibration through self-consistency loss on real data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-consistency loss on unlabeled data
Neural surrogates for simulation-based training
Improved calibration under model misspecification
Šimon Kucharský
Department of Computational Statistics, Technical University Dortmund, Dortmund, Germany
Aayush Mishra
Department of Computational Statistics, Technical University Dortmund, Dortmund, Germany
Daniel Habermann
Department of Computational Statistics, Technical University Dortmund, Dortmund, Germany
Stefan T. Radev
Assistant Professor, Rensselaer Polytechnic Institute
Deep Learning, Bayesian Statistics, Stochastic Models, Machine Learning, Cognitive Modeling
Paul-Christian Bürkner
Full Professor of Computational Statistics, TU Dortmund University
Bayesian Statistics, Uncertainty Quantification, Simulation-Based Inference, Prior Specification