🤖 AI Summary
This work addresses the limitations of traditional Bayesian optimal experimental design (BOED), which maximizes expected information gain (EIG) defined through the Kullback-Leibler divergence and is therefore sensitive to support mismatch, tail underestimation, and rare events. The authors integrate integral probability metrics (IPMs), such as the Wasserstein distance, maximum mean discrepancy, and energy distance, into BOED, replacing density-ratio-based objectives with geometry-aware discrepancy measures. The result is a plug-and-play, sample-based framework that, according to the authors' theoretical guarantees, is more stable under surrogate-model error and prior misspecification than EIG-based utilities. Empirically, IPM-based designs yield highly concentrated posterior credible sets. The same template also extends to geometry-aware discrepancies beyond the IPM class: with a neural optimal transport estimator, it produces accurate optimal designs in high-dimensional settings where conventional nested Monte Carlo estimators and variational methods fail.
📝 Abstract
Bayesian Optimal Experimental Design (BOED) provides a rigorous framework for decision-making tasks in which data acquisition is the critical bottleneck, especially in resource-constrained settings. Traditionally, BOED selects designs by maximizing expected information gain (EIG), commonly defined through the Kullback-Leibler (KL) divergence. However, classical evaluation of EIG involves challenging nested expectations, and even advanced variational methods leave the underlying log-density-ratio objective unchanged. As a result, support mismatch, tail underestimation, and rare-event sensitivity remain intrinsic concerns for KL-based BOED. To address these bottlenecks, we introduce a flexible, plug-and-play BOED framework that replaces density-based divergences with integral probability metrics (IPMs), including the Wasserstein distance, Maximum Mean Discrepancy, and Energy Distance. We establish theoretical guarantees showing that IPM-based utilities provide stronger geometry-aware stability under surrogate-model error and prior misspecification than classical EIG-based utilities. We also validate the proposed framework empirically, demonstrating that IPM-based designs yield highly concentrated credible sets. Furthermore, we extend the same sample-based BOED template in a plug-and-play manner to geometry-aware discrepancies beyond the IPM class, illustrated by a neural optimal transport estimator, and achieve accurate optimal designs in high-dimensional settings where conventional nested Monte Carlo estimators and advanced variational methods fail.
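To make the idea of a sample-based, IPM-style design utility concrete, the following is a minimal sketch and not the authors' estimator. It assumes a toy linear-Gaussian model (prior θ ~ N(0, 1), observation y = d·θ + noise), an RBF kernel with a fixed bandwidth, and a utility defined as the expected squared Maximum Mean Discrepancy between posterior and prior samples, by analogy with EIG. The function names (`rbf_kernel`, `mmd2`, `mmd_utility`), the model, and all parameter values are illustrative assumptions; the closed-form posterior is used only because this toy model is conjugate.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """RBF kernel matrix between 1-D sample arrays a (n,) and b (m,)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())


def mmd_utility(d, noise_std=1.0, n_outer=200, n_inner=200, seed=0):
    """Assumed utility: E_y[ MMD^2(posterior(theta | y, d), prior(theta)) ].

    Toy model (an assumption for illustration):
        theta ~ N(0, 1),  y | theta, d ~ N(d * theta, noise_std^2).
    """
    rng = np.random.default_rng(seed)
    prior_samples = rng.normal(0.0, 1.0, size=n_inner)
    denom = d ** 2 + noise_std ** 2
    post_std = np.sqrt(noise_std ** 2 / denom)  # conjugate posterior std
    total = 0.0
    for _ in range(n_outer):
        # Simulate an outcome from the prior predictive under design d.
        theta = rng.normal()
        y = d * theta + noise_std * rng.normal()
        # Draw posterior samples using the closed-form Gaussian posterior.
        post_mean = d * y / denom
        post_samples = rng.normal(post_mean, post_std, size=n_inner)
        total += mmd2(post_samples, prior_samples)
    return total / n_outer


if __name__ == "__main__":
    # Compare a nearly uninformative design with a more informative one.
    for design in (0.1, 0.5, 2.0):
        print(f"d = {design}: estimated utility = {mmd_utility(design):.4f}")
```

In this sketch the utility is computed purely from samples, with no density ratios or log-likelihood evaluations, which is the property the abstract attributes to IPM-based objectives. Swapping `mmd2` for another sample-based discrepancy (for example an energy distance or a Wasserstein estimate) would leave the outer loop unchanged.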