🤖 AI Summary
This work addresses the high computational cost of repeated simulations in hierarchical simulation-based inference by proposing Tokenised Flow Matching for Posterior Estimation (TFMPE). The method combines likelihood factorisation with a per-site neural surrogate of the simulator, so that training requires only single-site simulations, and applies tokenised flow matching to hierarchical Bayesian inference. This enables amortised posterior estimation with function-valued observations. Experiments on a newly introduced hierarchical SBI benchmark, as well as on infectious disease and computational fluid dynamics models, show that TFMPE yields well-calibrated posteriors while reducing computational cost.
📝 Abstract
The cost of simulator evaluations is a key practical bottleneck for simulation-based inference (SBI). In hierarchical settings with shared global parameters and exchangeable site-level parameters and observations, this structure can be exploited to improve simulation efficiency. Existing hierarchical SBI approaches factorise the posterior yet still simulate across multiple sites per training sample; we instead explore likelihood factorisation (LF) to train from single-site simulations. In LF sampling, we learn a per-site neural surrogate of the simulator and then assemble synthetic multi-site observations to amortise inference for the full hierarchical posterior. Building on this, we propose Tokenised Flow Matching for Posterior Estimation (TFMPE), a tokenised flow matching approach that supports function-valued observations through likelihood factorisation. To enable systematic evaluation, we introduce a benchmark for hierarchical SBI. We validate TFMPE on this benchmark and on realistic infectious disease and computational fluid dynamics models, finding well-calibrated posteriors while reducing computational cost.
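As a reading aid, the hierarchical structure described above can be written out as a factorised posterior. The notation here is assumed and does not come from the abstract: \theta stands for the shared global parameters, \phi_i for the parameters of site i, y_i for the observation at site i, and N for the number of sites. The sketch further assumes that sites are conditionally independent given \theta.

```latex
\[
  p(\theta, \phi_{1:N} \mid y_{1:N})
  \;\propto\;
  p(\theta) \prod_{i=1}^{N} p(\phi_i \mid \theta)\, p(y_i \mid \theta, \phi_i)
\]
```

Under these assumptions, the only term that involves the simulator is the per-site likelihood p(y_i \mid \theta, \phi_i). A surrogate for that single term can in principle be fitted from single-site simulations and then reused across all N sites, which is presumably what makes likelihood factorisation simulation-efficient.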