🤖 AI Summary
Traditional reconstruction-based diffusion models for anomaly detection require multiple denoising sampling steps and careful tuning of noise levels, resulting in high computational overhead and slow inference. This paper proposes an end-to-end, reconstruction-free anomaly discrimination paradigm: it takes the probability density of the input's latent variables under the Gaussian prior, computed in the pretrained diffusion model's latent space, and uses it directly as the anomaly score, enabling reliable detection with a partial diffusion process of only 2–5 steps. By bypassing explicit image reconstruction, the method eliminates reconstruction error computation and mitigates sensitivity to hyperparameters, thereby improving both efficiency and robustness. Evaluated on MVTec-AD, it achieves a state-of-the-art AUC of 0.991 while sustaining real-time inference at 15 FPS (about 67 ms per image). To our knowledge, this is the first approach to achieve industrial-grade real-time anomaly detection without compromising SOTA accuracy, thus redefining the speed-accuracy trade-off frontier in anomaly detection.
📝 Abstract
Diffusion models, with their robust distribution approximation capabilities, have demonstrated excellent performance in anomaly detection. However, conventional reconstruction-based approaches rely on computing the reconstruction error between the original and denoised images, which requires careful noise-strength tuning and over ten network evaluations per input, leading to significantly slower detection speeds. To address these limitations, we propose a novel diffusion-based anomaly detection method that circumvents the need for resource-intensive reconstruction. Instead of reconstructing the input image, we directly infer its corresponding latent variables and measure their density under the Gaussian prior distribution. Remarkably, the prior density proves effective as an anomaly score even when using a short partial diffusion process of only 2-5 steps. We evaluate our method on the MVTecAD dataset, achieving an AUC of 0.991 at 15 FPS, thereby setting a new state-of-the-art speed-AUC trade-off for anomaly detection.
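To make the scoring idea concrete, the following is a minimal sketch of measuring a latent's density under a standard Gaussian prior N(0, I), assuming the latent has already been inferred. It is not the authors' implementation: the function name `gaussian_prior_score`, the latent dimensionality, and the synthetic latents are all hypothetical, and the partial diffusion step that would produce the latent from an image is left out.

```python
import numpy as np

def gaussian_prior_score(z: np.ndarray) -> float:
    """Negative log-density of a latent under N(0, I).

    Higher values mean the latent is less likely under the prior,
    so the value can serve as an anomaly score.
    """
    z = np.asarray(z, dtype=np.float64).ravel()
    d = z.size
    # -log N(z; 0, I) = 0.5 * ||z||^2 + 0.5 * d * log(2 * pi)
    return 0.5 * float(z @ z) + 0.5 * d * float(np.log(2.0 * np.pi))

# Synthetic stand-ins for inferred latents (assumption: 64 dimensions).
rng = np.random.default_rng(0)
normal_latent = rng.standard_normal(64)   # resembles a sample from the prior
shifted_latent = normal_latent + 3.0      # hypothetical off-distribution latent

normal_score = gaussian_prior_score(normal_latent)
shifted_score = gaussian_prior_score(shifted_latent)
print(f"normal: {normal_score:.2f}, shifted: {shifted_score:.2f}")
```

In this sketch a latent that drifts away from the prior's typical region receives a higher score. In practice a decision threshold would be chosen from scores on anomaly-free data.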