π€ AI Summary
Normalizing flow-based neural posterior estimation (NPE) under black-box likelihoods suffers from training instability and a fundamental trade-off between expressive power and computational cost. Method: This paper introduces conditional diffusion models into the NPE framework for the first time, proposing a denoising diffusion probabilistic model (DDPM)-based conditional NPE method. Unlike flow-based approaches, it imposes no invertibility constraints and natively supports both summary networks and end-to-end modeling. Contribution/Results: Experiments across diverse benchmark tasks demonstrate that the proposed method significantly improves posterior estimation accuracy and training stability, achieves faster convergence, and outperforms state-of-the-art normalizing flow modelsβeven when using shallower architectures. It establishes a more robust and computationally efficient paradigm for simulation-based Bayesian inference.
π Abstract
Neural posterior estimation (NPE), a simulation-based computational approach for Bayesian inference, has shown great success in situations where posteriors are intractable or likelihood functions are treated as"black boxes."Existing NPE methods typically rely on normalizing flows, which transform a base distributions into a complex posterior by composing many simple, invertible transformations. But flow-based models, while state of the art for NPE, are known to suffer from several limitations, including training instability and sharp trade-offs between representational power and computational cost. In this work, we demonstrate the effectiveness of conditional diffusions as an alternative to normalizing flows for NPE. Conditional diffusions address many of the challenges faced by flow-based methods. Our results show that, across a highly varied suite of benchmarking problems for NPE architectures, diffusions offer improved stability, superior accuracy, and faster training times, even with simpler, shallower models. These gains persist across a variety of different encoder or"summary network"architectures, as well as in situations where no summary network is required. The code will be publicly available at url{https://github.com/TianyuCodings/cDiff}.