AI Summary
Neural samplers for unnormalized densities remain less efficient in target-function evaluations than parallel tempering (PT), while PT yields only dependent, autocorrelated samples and must be rerun, at considerable computational cost, whenever new samples are required.
Method: We propose a progressive sampling framework that combines diffusion models with temperature annealing.
Contribution/Results: The key innovation is a sequential cross-temperature training mechanism: high-temperature diffusion models are combined to synthesize approximate lower-temperature samples, which are then refined with a small number of MCMC steps and used to train the next diffusion model. This reuses sample information across temperature levels and yields well-mixed, uncorrelated samples. The method significantly improves target-evaluation efficiency over existing diffusion-based neural samplers and, unlike PT, does not need to be rerun whenever new samples are required.
Abstract
Recent research has focused on designing neural samplers that amortize the process of sampling from unnormalized densities. However, despite significant advancements, they still fall short of the state-of-the-art MCMC approach, Parallel Tempering (PT), when it comes to the efficiency of target evaluations. On the other hand, unlike a well-trained neural sampler, PT yields only dependent samples and needs to be rerun -- at considerable computational cost -- whenever new samples are required. To address these weaknesses, we propose the Progressive Tempering Sampler with Diffusion (PTSD), which trains diffusion models sequentially across temperatures, leveraging the advantages of PT to improve the training of neural samplers. We also introduce a novel method to combine high-temperature diffusion models to generate approximate lower-temperature samples, which are minimally refined using MCMC and used to train the next diffusion model. PTSD enables efficient reuse of sample information across temperature levels while generating well-mixed, uncorrelated samples. Our method significantly improves target evaluation efficiency, outperforming diffusion-based neural samplers.
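The progressive idea in the abstract (approximate samples at one temperature, lightly refined by MCMC at the next lower temperature) can be illustrated with a toy sketch. This is not the authors' implementation: it contains no diffusion model, and the previous level's samples simply stand in for what a trained high-temperature model would generate. The double-well `energy`, the temperature ladder `[16.0, 4.0, 1.0]`, the step-size rule, and all function names are assumptions chosen for illustration. The tempered target is taken to be proportional to exp(-E(x)/T), and the sketch counts target evaluations, the cost measure the abstract refers to.

```python
import numpy as np


def energy(x):
    """Toy double-well energy with minima at x = -2 and x = +2 (assumed for illustration)."""
    return 0.5 * (x**2 - 4.0) ** 2


def log_tempered(x, temperature):
    """Unnormalized log-density of the tempered target, proportional to exp(-E(x)/T)."""
    return -energy(x) / temperature


def mh_refine(x, temperature, n_steps, step_size, rng):
    """Run a few random-walk Metropolis steps on every sample independently.

    Returns the refined samples and the number of energy evaluations used.
    """
    logp = log_tempered(x, temperature)
    n_evals = x.size
    for _ in range(n_steps):
        proposal = x + step_size * rng.standard_normal(x.shape)
        logp_prop = log_tempered(proposal, temperature)
        n_evals += x.size
        # Accept with probability min(1, pi_T(proposal) / pi_T(x)).
        accept = np.log(rng.uniform(size=x.shape)) < logp_prop - logp
        x = np.where(accept, proposal, x)
        logp = np.where(accept, logp_prop, logp)
    return x, n_evals


def progressive_anneal(temperatures, n_samples=2000, n_steps=20, seed=0):
    """Move a population of samples down a temperature ladder (high to low)."""
    rng = np.random.default_rng(seed)
    # Broad Gaussian initialization, standing in for easy high-temperature samples.
    x = 3.0 * rng.standard_normal(n_samples)
    total_evals = 0
    for t in temperatures:
        # In PTSD a diffusion model would generate the approximate samples here;
        # this sketch reuses the previous level's samples directly (an assumption).
        x, n_evals = mh_refine(x, t, n_steps, step_size=0.5 * np.sqrt(t), rng=rng)
        total_evals += n_evals
    return x, total_evals


samples, evals = progressive_anneal([16.0, 4.0, 1.0])
print("target evaluations:", evals)
print("mean |x| at T=1:", np.abs(samples).mean())
print("fraction in right-hand mode:", (samples > 0).mean())
```

Because each level starts from samples that are already close to the next target, only a few refinement steps are needed per temperature, which is the intuition behind reusing sample information across levels. Unlike this sketch, a trained neural sampler would let new samples be drawn afterwards without repeating the MCMC runs.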