🤖 AI Summary
Existing anomaly detection benchmarks (e.g., MVTec-AD) cover only a narrow range of categories, which hinders evaluation of cross-domain generalization and scalability. To address this, we introduce ADNet, the first large-scale, multi-domain anomaly detection benchmark, comprising 380 categories across five domains, 196,294 RGB images, pixel-level annotations, and structured text descriptions of visual and spatial attributes. Using ADNet, we show that mainstream methods degrade substantially under multi-class scaling: I-AUROC drops from 90.6% in the one-for-one setting to 78.5% in the multi-class setting over all 380 categories. To mitigate this, we propose Dinomaly-m, a context-guided Mixture-of-Experts model that enables class-adaptive modeling without increasing inference overhead. On the full 380-class setting, Dinomaly-m achieves 83.2% I-AUROC and 93.1% P-AUROC, compared with 78.5% I-AUROC for existing state-of-the-art methods.
📝 Abstract
Anomaly detection (AD) aims to identify defects using normal-only training data. Existing AD benchmarks (e.g., MVTec-AD with 15 categories) cover only a narrow range of categories, limiting the evaluation of cross-context generalization and scalability. We introduce ADNet, a large-scale, multi-domain benchmark comprising 380 categories aggregated from 49 publicly available datasets across the Electronics, Industry, Agrifood, Infrastructure, and Medical domains. The benchmark contains 196,294 RGB images: 116,192 normal samples for training and 80,102 test images, of which 60,311 are anomalous and the remaining 19,791 are normal. All images are standardized with MVTec-style pixel-level annotations and structured text descriptions covering both spatial and visual attributes, enabling multimodal anomaly detection tasks. Extensive experiments reveal a clear scalability challenge: existing state-of-the-art methods achieve 90.6% I-AUROC in the one-for-one setting but drop to 78.5% when scaled to all 380 categories in a multi-class setting. To address this, we propose Dinomaly-m, a context-guided Mixture-of-Experts extension of Dinomaly that expands decoder capacity without increasing inference cost. In the 380-category multi-class setting it achieves 83.2% I-AUROC and 93.1% P-AUROC, compared with 78.5% I-AUROC for existing methods. ADNet is designed as a standardized and extensible benchmark that supports the community in expanding anomaly detection datasets across diverse domains and provides a scalable basis for future anomaly detection foundation models. Dataset: https://grainnet.github.io/ADNet
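
The I-AUROC and P-AUROC figures quoted above are image-level and pixel-level areas under the ROC curve. The following is a minimal sketch of how such scores are commonly computed, not ADNet's official evaluation code. It assumes that each test image comes with a per-pixel anomaly map and a binary ground-truth mask, that the image-level score is the maximum of the anomaly map, and that pixel scores are pooled across images; the function names `auroc` and `image_and_pixel_auroc` are hypothetical, and how results are aggregated across the 380 categories is not specified here.

```python
from typing import List, Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via the rank-sum (Mann-Whitney U) identity.

    labels: 1 = anomalous, 0 = normal. Tied scores share the average rank.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one normal and one anomalous sample")

    # Sort indices by score, then give each group of ties its 1-based average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


Grid = List[List[float]]


def image_and_pixel_auroc(anomaly_maps: List[Grid], gt_masks: List[Grid]) -> Tuple[float, float]:
    """Return (I-AUROC, P-AUROC) under the assumptions stated in the lead-in."""
    # Assumption: image score = max pixel score; image label = 1 if any mask pixel is set.
    image_scores = [max(max(row) for row in amap) for amap in anomaly_maps]
    image_labels = [int(any(any(row) for row in mask)) for mask in gt_masks]
    # Assumption: pixels from all images are pooled into one ranking.
    pixel_scores = [v for amap in anomaly_maps for row in amap for v in row]
    pixel_labels = [int(v) for mask in gt_masks for row in mask for v in row]
    return auroc(image_scores, image_labels), auroc(pixel_scores, pixel_labels)


# Toy data: one normal image and one image with a single defective pixel.
toy_maps = [[[0.1, 0.2], [0.1, 0.3]], [[0.2, 0.9], [0.1, 0.4]]]
toy_masks = [[[0, 0], [0, 0]], [[0, 1], [0, 0]]]
i_auroc, p_auroc = image_and_pixel_auroc(toy_maps, toy_masks)
print(f"I-AUROC={i_auroc:.3f}  P-AUROC={p_auroc:.3f}")
```

The rank-based formulation avoids sweeping thresholds explicitly and handles tied scores by averaging ranks; on real data a library routine such as scikit-learn's `roc_auc_score` would usually be the practical choice.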