Information-Preserving Domain Transfer with Unlabeled Data in Misspecified Simulation-Based Inference

📅 2026-05-07
🤖 AI Summary
This work addresses the performance degradation of simulation-based inference (SBI) under model misspecification, where a mismatch between the simulated and real data distributions impairs inference accuracy. The authors propose SPIN, a framework for domain transfer with unlabeled data that explicitly aims to preserve the mutual information between parameters and observations. During training, SPIN learns a cycle mapping (simulated to real and back to simulated) from labeled simulator observations and unlabeled, unpaired real observations, using the original simulator labels to encourage the transfer to keep parameter-relevant information. Marginal distribution alignment alone can discard that information, so this step is meant to keep the transferred data suitable for Bayesian posterior inference. At test time, the learned real-to-simulator map carries real observations into the simulator domain, with no real-world parameter labels or paired data required. Across controlled synthetic and physical real-world benchmarks, SPIN improves real-world posterior inference, and the improvement becomes clearer as misspecification increases.
📝 Abstract
Simulation-based inference (SBI) provides amortized Bayesian parameter inference from simulator-generated data without requiring explicit likelihood evaluation. Its reliability can degrade under model misspecification, where real-world observations are not well represented by the simulator used for training. Existing methods using unlabeled real-world data often align simulated and real-world data distributions, but marginal alignment alone does not directly preserve parameter-relevant information needed for posterior inference. We propose SPIN, an SBI framework with parameter-relevant information-preserving domain transfer using unlabeled, unpaired real-world observations. During training, SPIN translates labeled simulator observations toward the real-world domain and back to the simulator domain, using the original simulator labels to encourage domain transfer that preserves parameter-relevant mutual information. At test time, the learned real-to-simulator transport maps real-world observations into the simulator domain for posterior inference, without requiring real-world parameter labels or paired real-simulator observations. Across controlled synthetic and physical real-world benchmarks, SPIN improves real-world posterior inference, with the improvement becoming clearer as misspecification increases.
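The point that marginal alignment alone does not preserve parameter-relevant information can be illustrated with a small toy experiment. The sketch below is not SPIN and is not taken from the paper: the scalar simulator, the shifted and rescaled "real" domain, and the two maps `affine_to_sim` and `resample_to_sim` are all hypothetical choices made for illustration. Both maps send real observations to outputs whose marginal matches the simulator domain, but only one keeps the dependence on the parameter. Correlation is used here as a crude stand-in for mutual information, which is reasonable only because the toy is close to linear-Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical toy setup (an assumption, not the paper's benchmark).
theta_sim = rng.normal(size=n)
x_sim = theta_sim + 0.1 * rng.normal(size=n)                  # simulator domain

theta_real = rng.normal(size=n)                               # unknown in practice
x_real = 2.0 * theta_real + 1.0 + 0.1 * rng.normal(size=n)    # misspecified "real" domain


def affine_to_sim(x):
    """Map A: affine moment matching to the simulator marginal (invertible)."""
    return (x - x_real.mean()) / x_real.std() * x_sim.std() + x_sim.mean()


def resample_to_sim(x):
    """Map B: replace each input by a random simulator draw.

    The output marginal matches the simulator domain, but the output
    no longer depends on the input at all.
    """
    return rng.choice(x_sim, size=x.shape[0], replace=True)


x_a = affine_to_sim(x_real)
x_b = resample_to_sim(x_real)

# Both outputs have (approximately) the simulator marginal ...
print("means:", x_sim.mean(), x_a.mean(), x_b.mean())
print("stds: ", x_sim.std(), x_a.std(), x_b.std())

# ... but only map A retains the dependence on the true parameter.
corr_a = np.corrcoef(theta_real, x_a)[0, 1]
corr_b = np.corrcoef(theta_real, x_b)[0, 1]
print("corr(theta, map A output):", corr_a)
print("corr(theta, map B output):", corr_b)
```

In this toy, the ground-truth `theta_real` is used only to score the two maps. In the setting described in the abstract no real-world labels exist, which is why SPIN relies on the simulator labels during the simulator-to-real-and-back cycle instead.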
Problem

Research questions and friction points this paper is trying to address.

simulation-based inference
model misspecification
domain transfer
unlabeled data
information preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

simulation-based inference
domain transfer
information preservation
model misspecification
unlabeled data
Joon Jang
Department of Biomedical Sciences, Seoul National University, Seoul, Republic of Korea
Eunho Jeong
Department of Applied Bioengineering, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Republic of Korea
Kyu Sung Choi
Assistant Professor, Department of Radiology, Seoul National University Hospital
Radiology, Neuroimage, Deep Learning, Neuro-Oncology
Hyeonjin Kim
Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea; Department of Medical Sciences, Seoul National University, Seoul, Republic of Korea