🤖 AI Summary
Diffusion models exhibit significant variability in sample quality, and existing methods struggle to identify low-quality generations in an unsupervised manner.
Method: We propose the first Bayesian uncertainty estimation framework grounded in semantic likelihood within the latent space—requiring no architectural modification to pretrained diffusion or flow-matching models and enabling zero-shot posterior adaptation. Our approach integrates Laplace approximation, uncertainty calibration, and efficient sampling optimization to define a semantic likelihood over the feature extractor’s latent space, enabling per-sample uncertainty quantification.
Contribution/Results: Evaluated across multiple benchmarks, our method improves low-quality sample detection by over 40% while increasing computational overhead by less than 15%, substantially outperforming existing uncertainty baselines. It provides principled, model-agnostic uncertainty estimates without retraining or fine-tuning.
📝 Abstract
Diffusion models have recently driven significant breakthroughs in generative modeling. While state-of-the-art models produce high-quality samples on average, individual samples can still be low quality. Detecting such samples without human inspection remains a challenging task. To address this, we propose a Bayesian framework for estimating generative uncertainty of synthetic samples. We outline how to make Bayesian inference practical for large, modern generative models and introduce a new semantic likelihood (evaluated in the latent space of a feature extractor) to address the challenges posed by high-dimensional sample spaces. Through our experiments, we demonstrate that the proposed generative uncertainty effectively identifies poor-quality samples and significantly outperforms existing uncertainty-based methods. Notably, our Bayesian framework can be applied post-hoc to any pretrained diffusion or flow matching model (via the Laplace approximation), and we propose simple yet effective techniques to minimize its computational overhead during sampling.