🤖 AI Summary
Simulation-based inference (SBI) suffers from model misspecification: when the simulator is inaccurate, simulated and real observations follow different distributions, which degrades the inferred posteriors.
Method: We propose a fully inductive and amortized SBI framework that enables robust inference without requiring access to a batch of test samples at inference time. Our approach unifies calibration and distribution alignment (handled as two separate stages in RoPE) in a single end-to-end trainable model. It combines mini-batch optimal transport with a closed-form coupling, semi-supervised calibration, and joint alignment over both paired and unpaired samples; a conditional normalizing flow is then trained to approximate the OT-induced posterior.
Contribution/Results: Because inference is amortized, the framework needs neither a batch of test samples nor simulation access at test time. On synthetic benchmarks and real-world tasks, including medical biomarker estimation, it matches or surpasses RoPE and other standard SBI and non-SBI estimators, while offering improved scalability and applicability under model misspecification.
📝 Abstract
Simulation-based inference (SBI) is a statistical inference approach for estimating latent parameters of a physical system when the likelihood is intractable but simulations are available. In practice, SBI is often hindered by model misspecification: the mismatch between simulated and real-world observations caused by inherent modeling simplifications. RoPE, a recent SBI approach, addresses this challenge through a two-stage domain transfer process that combines semi-supervised calibration with optimal transport (OT)-based distribution alignment. However, RoPE operates in a fully transductive setting, requiring access to a batch of test samples at inference time, which limits scalability and generalization. We propose a fully inductive and amortized SBI framework that integrates calibration and distributional alignment into a single, end-to-end trainable model. Our method leverages mini-batch OT with a closed-form coupling to align real and simulated observations that correspond to the same latent parameters, using both paired calibration data and unpaired samples. A conditional normalizing flow is then trained to approximate the OT-induced posterior, enabling efficient inference without simulation access at test time. Across a range of synthetic and real-world benchmarks, including complex medical biomarker estimation, our approach matches or surpasses the performance of RoPE, as well as other standard SBI and non-SBI estimators, while offering improved scalability and applicability in challenging, misspecified environments.
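
To make the idea of a mini-batch OT coupling in closed form more concrete, the following is a minimal illustrative sketch and not the authors' implementation. The abstract does not say which coupling is used; the sketch assumes a semi-relaxed entropic formulation, in which only the marginal over real samples is constrained, so that each row of the plan reduces to a softmax over negative costs. All function names, the toy grid data, and the value of `eps` are hypothetical, and the conditional normalizing flow that would be trained on the resulting targets is omitted.

```python
import numpy as np


def pairwise_sq_dists(x, y):
    """Squared Euclidean cost matrix between rows of x (n, d) and y (m, d)."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def semi_relaxed_coupling(cost, eps=0.01):
    """Closed-form entropic coupling with only the row marginal fixed (assumed form).

    Row i is softmax(-cost[i] / eps) scaled by the uniform mass 1 / n,
    so every real sample distributes the same total mass over simulations.
    """
    n = cost.shape[0]
    logits = -cost / eps
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights / n


def soft_targets(coupling, theta_sim):
    """Barycentric projection: coupling-weighted mean of simulated parameters."""
    row_mass = coupling.sum(axis=1, keepdims=True)
    return (coupling / row_mass) @ theta_sim


# Toy mini-batch: simulated parameters on an 8 x 8 grid, embedded by the identity.
axis = np.linspace(-1.0, 1.0, 8)
theta_sim = np.stack(np.meshgrid(axis, axis), axis=-1).reshape(-1, 2)
z_sim = theta_sim.copy()

# "Real" embeddings: the first 8 simulated points with a small systematic shift,
# standing in for the simulation-to-reality gap.
z_real = z_sim[:8] + 0.01

coupling = semi_relaxed_coupling(pairwise_sq_dists(z_real, z_sim))
targets = soft_targets(coupling, theta_sim)
print(coupling.shape)  # (8, 64)
```

In this toy setting each real sample places most of its mass on the simulated sample generated from the same parameters, so `targets` stays close to `theta_sim[:8]`. A fully balanced coupling would instead need an iterative solver such as Sinkhorn, which is one plausible reason to prefer a closed-form variant inside a training loop.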