🤖 AI Summary
Model misspecification poses a critical challenge in simulation-based inference (SBI), often leading to biased posteriors and miscalibrated uncertainty estimates. To address this, we propose a two-stage robust SBI framework: in the first stage, a posterior estimator is trained on abundant simulated data; in the second stage, a small set of real-world observations drives a flow-matching-based correction for distributional shift, adaptively refining the posterior without assuming any specific form of misspecification. This work introduces flow matching to SBI calibration for the first time, achieving both scalability and robustness to distributional shift. Experiments on synthetic and real-world benchmark datasets demonstrate that our method significantly improves posterior accuracy and calibration quality over state-of-the-art baselines, while maintaining computational efficiency.
📝 Abstract
Simulation-based inference (SBI) is transforming experimental sciences by enabling parameter estimation in complex non-linear models from simulated data. A persistent challenge, however, is model misspecification: simulators are only approximations of reality, and mismatches between simulated and real data can yield biased or overconfident posteriors. We address this issue by introducing Flow Matching Corrected Posterior Estimation (FMCPE), a framework that leverages the flow matching paradigm to refine simulation-trained posterior estimators using a small set of real calibration samples. Our approach proceeds in two stages: first, a posterior approximator is trained on abundant simulated data; second, flow matching transports its predictions toward the true posterior supported by real observations, without requiring explicit knowledge of the misspecification. This design enables FMCPE to combine the scalability of SBI with robustness to distributional shift. Across synthetic benchmarks and real-world datasets, we show that our proposal consistently mitigates the effects of misspecification, delivering improved inference accuracy and uncertainty calibration compared to standard SBI baselines, while remaining computationally efficient.
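The two-stage idea can be illustrated with a deliberately simple sketch. The snippet below is **not** the paper's implementation: it stands in for stage one with a biased Gaussian "simulation-trained" posterior sampler, uses a toy Gaussian "real" calibration set, and fits a linear-in-features conditional flow-matching velocity field with ordinary least squares instead of a neural network. All function names (`biased_posterior_samples`, `true_posterior_samples`, `correct`) are hypothetical; only the structure — interpolate source/target pairs, regress the velocity `x1 - x0`, then integrate the learned ODE to transport stage-one samples — mirrors the flow-matching correction described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 stand-in: a "simulation-trained" posterior sampler that is biased
# relative to reality (toy assumption: samples centered at -1 instead of +1).
def biased_posterior_samples(n):
    return rng.normal(loc=-1.0, scale=1.0, size=(n, 1))

# Small set of "real" calibration samples from the true posterior (toy: N(1, 1)).
def true_posterior_samples(n):
    return rng.normal(loc=1.0, scale=1.0, size=(n, 1))

# Stage 2: conditional flow matching with independent coupling.
# Pair source samples x0 with target samples x1, interpolate at random times t,
# and regress the velocity target (x1 - x0) on simple features of (x_t, t).
n = 2000
x0 = biased_posterior_samples(n)
x1 = true_posterior_samples(n)
t = rng.uniform(size=(n, 1))
xt = (1 - t) * x0 + t * x1
v_target = x1 - x0

# Linear velocity model v(x, t) ~ w . [x, t, 1] (a neural net in practice).
feats = np.hstack([xt, t, np.ones((n, 1))])
w, *_ = np.linalg.lstsq(feats, v_target, rcond=None)

def velocity(x, t_scalar):
    return np.hstack([x, np.full_like(x, t_scalar), np.ones_like(x)]) @ w

# Inference: transport fresh stage-one samples along the learned flow
# by integrating dx/dt = v(x, t) from t=0 to t=1 with Euler steps.
def correct(samples, steps=50):
    x = samples.copy()
    for k in range(steps):
        x += velocity(x, k / steps) / steps
    return x

corrected = correct(biased_posterior_samples(5000))
print(corrected.mean())  # shifted from about -1 toward the true mean of about +1
```

In this toy setting the learned flow simply translates the biased samples onto the calibration distribution; the appeal of the flow-matching formulation is that the same recipe applies when the shift is nonlinear and the velocity field is parameterized by a neural network.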