🤖 AI Summary
This work examines why diffusion posterior samplers fail in imaging inverse problems. Because these samplers rely on inexact likelihood approximations at intermediate timesteps, they can produce distorted posteriors, misweighted modes, and hallucinations, and the downstream effect of these approximations is poorly understood. The authors introduce a finite-sample perspective on posterior sampling that approximates the true posterior to arbitrary precision as the training set size tends to infinity, for any forward model and prior. They show that posterior errors require neither a nonlinear measurement model nor a multimodal posterior: a multimodal prior combined with inaccurate posterior spread at intermediate sampling times is sufficient. Through this lens, popular approximations are found to under- or over-estimate the posterior spread at intermediate timesteps, which leads to sensitivity to early stopping time, inaccurate relative weighting of posterior modes, and hallucinated modes. The approach is agnostic to the likelihood approximation and to the (linear or nonlinear) forward model, so it can serve as a drop-in diagnostic for existing and future posterior samplers.
📝 Abstract
Diffusion models have excellent capacity to model complex distributions of natural data, which has made them a popular and effective choice for posterior sampling in imaging inverse problems. Existing methods can incorporate any measurement model at inference time but must use an inexact approximation for the likelihood at intermediate timesteps for computational tractability. Although these approximations can often work well empirically, their downstream effect on the sampled posterior is poorly understood and can result in unexplained failures. To understand when, why, and how these likelihood approximations propagate to erroneous posterior distributions, we introduce a finite-sample perspective on posterior sampling that approximates the posterior to arbitrary precision as training set size tends towards infinity, for any forward model and prior distribution. Using this finite-sample lens, we observe that popular posterior sampling approximations tend to under- or over-estimate the spread of the posterior at intermediate timesteps, causing downstream consequences including sensitivity to early stopping time, inaccurate relative weighting of posterior modes, and hallucination, both of prior modes that are not in the posterior and likelihood modes that are not supported by the prior. Moreover, we find that the cause of these posterior errors requires neither a nonlinear measurement model nor a multimodal posterior, but can arise solely due to a multimodal prior and inaccurate posterior spread at intermediate sampling times. Our finite-sample posterior sampling approach is agnostic to the type of likelihood approximation and the type of (linear or nonlinear) forward model, and can thus serve as a drop-in diagnostic to evaluate the accuracy and failure modes of existing and future posterior samplers.
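To make the finite-sample perspective concrete, the following is a minimal sketch and not the authors' implementation. It assumes the prior is the empirical distribution over the training points, the measurement noise is Gaussian with standard deviation `sigma_y`, and the diffusion uses variance-exploding noising `x_t = x_0 + sigma_t * eps`. Under those assumptions the posterior is a reweighting of the training points by their likelihood, and the noisy posterior at any timestep is a Gaussian mixture centered on those points. The function names and the toy bimodal prior are hypothetical.

```python
import numpy as np


def finite_sample_posterior_weights(train, y, forward, sigma_y):
    """Posterior weights over training points under an empirical prior.

    Assumes a Gaussian likelihood y ~ N(forward(x), sigma_y^2 I);
    `forward` maps an (N, d) batch to an (N, m) batch and may be nonlinear.
    """
    residuals = (y - forward(train)).reshape(len(train), -1)
    log_w = -0.5 * np.sum(residuals**2, axis=1) / sigma_y**2
    log_w -= log_w.max()  # stabilize the exponent before normalizing
    w = np.exp(log_w)
    return w / w.sum()


def sample_noisy_posterior(train, weights, sigma_t, n, rng):
    """Draw n samples of x_t given y, assuming x_t = x_0 + sigma_t * eps.

    Since x_t depends on y only through x_0, the noisy posterior is the
    mixture sum_i w_i N(x_i, sigma_t^2 I), sampled here exactly.
    """
    idx = rng.choice(len(train), size=n, p=weights)
    x0 = train[idx]
    return x0 + sigma_t * rng.standard_normal(x0.shape)


# Toy setup (hypothetical): a bimodal 1D prior and an identity forward model.
rng = np.random.default_rng(0)
train = np.concatenate([
    rng.normal(-2.0, 0.3, size=(500, 1)),  # prior mode near -2
    rng.normal(2.0, 0.3, size=(500, 1)),   # prior mode near +2
])
y = np.array([1.5])
weights = finite_sample_posterior_weights(train, y, lambda x: x, sigma_y=1.0)
samples = sample_noisy_posterior(train, weights, sigma_t=0.5, n=2000, rng=rng)
```

Comparing the spread of `samples` at a given `sigma_t` with the intermediate samples of an approximate posterior sampler is one way such a reference could be used as a diagnostic. The setup here is linear and the posterior is essentially unimodal, yet the prior is multimodal, which is the regime the abstract identifies as already sufficient for errors.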