🤖 AI Summary
This work addresses the performance degradation of multi-class anomaly detection as the number of categories grows, where the distribution of normal samples becomes increasingly complex and heterogeneous. The authors propose DPDiff-AD, a diffusion-based model that conditions reconstruction on prototypes at two scales, local and global. Local prototypes capture fine-grained structural patterns via nearest-prototype aggregation, while global prototypes regularize the overall feature geometry through optimal transport regularization. A prototype-aware attention mechanism conditions the diffusion process on both prototype sets so that it reconstructs normality accurately even when the number of categories is large. Experiments on five benchmark datasets show that, on a 160-category dataset, the method improves image-level and pixel-level AUROC by 5.3 and 2.9 points, respectively, over the previous state-of-the-art method Dinomaly+, while maintaining stable performance as the number of categories increases.
📝 Abstract
Multi-class anomaly detection aims to build unified models across diverse product categories. However, as the number of categories grows, detection performance often degrades due to increasingly complex and heterogeneous normal distributions. To address this challenge, we propose DPDiff-AD, a Dual Prototype-conditioned Diffusion model for large-scale multi-class Anomaly Detection. DPDiff-AD models heterogeneous normal distributions through complementary local and global prototypes. Local prototypes capture representative fine-grained structural patterns via nearest-prototype aggregation, while global prototypes regulate holistic feature geometry through optimal transport regularization. Together, these dual-scale representations define a structured normality space. This space is refined through diffusion-based reconstruction conditioned on both local and global prototypes via prototype-aware attention. By jointly leveraging dual prototypes during generation, DPDiff-AD achieves precise normality modeling, preserves structured separability as category cardinality grows, and enables scalable anomaly discrimination. Extensive experiments across five benchmarks demonstrate the effectiveness and scalability of DPDiff-AD. On the 160-category large-scale dataset, it improves image- and pixel-level AUROC by 5.3 and 2.9 points, respectively, over the previous state-of-the-art method Dinomaly+, while maintaining stable performance as category cardinality increases.
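The abstract names two prototype operations without giving formulas: nearest-prototype aggregation and optimal transport regularization. The NumPy sketch below shows one plausible reading of each and is not the authors' implementation. The function names, the choice of `k`, the Euclidean distance, the uniform marginals, the entropic (Sinkhorn) solver, and the `epsilon` value are all assumptions made for illustration.

```python
import numpy as np


def nearest_prototype_aggregate(features, prototypes, k=3):
    """Hypothetical local step: average the k nearest prototypes of each feature."""
    # Pairwise squared Euclidean distances, shape (n_features, n_prototypes).
    d2 = ((features[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the k closest prototypes for every feature.
    nearest = np.argsort(d2, axis=1)[:, :k]
    # Mean over the selected prototypes, shape (n_features, dim).
    return prototypes[nearest].mean(axis=1)


def sinkhorn_plan(cost, epsilon=0.1, n_iters=200):
    """Hypothetical global step: entropic OT plan between uniform marginals."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # assumed uniform mass over features
    b = np.full(m, 1.0 / m)  # assumed uniform mass over prototypes
    kernel = np.exp(-cost / epsilon)
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)  # rescale columns toward marginal b
        u = a / (kernel @ v)    # rescale rows to match marginal a
    return u[:, None] * kernel * v[None, :]


# Toy data: 16 features and 4 prototypes in an 8-dimensional space.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(4, 8))
features = rng.normal(size=(16, 8))

aggregated = nearest_prototype_aggregate(features, prototypes, k=2)

# Squared-distance cost, scaled to [0, 1] to keep exp(-cost / epsilon) well behaved.
cost = ((features[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()
plan = sinkhorn_plan(cost)

# A scalar transport cost that could serve as a regularization term.
ot_loss = float((plan * cost).sum())
```

In this reading, `aggregated` would play the role of the local condition and `ot_loss` the role of the global regularizer. How DPDiff-AD actually defines either quantity, and how they enter the prototype-aware attention, is not specified in the abstract.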