🤖 AI Summary
Existing visual anomaly detection (VAD) benchmarks largely overlook the critical impact of coupled viewpoint and illumination variations on defect visibility, yielding distorted robustness evaluations. To address this, we introduce M2AD, the first large-scale multi-view, multi-illumination VAD benchmark, comprising 119,880 high-resolution images that systematically cover diverse viewpoint–illumination configurations. We explicitly model and evaluate the viewpoint–illumination coupling effect through two complementary evaluation protocols: M2AD-Synergy (cross-configuration feature fusion) and M2AD-Invariant (single-image invariance). Built on a synchronized multi-camera array, programmable multi-source lighting, and industrial-grade sample acquisition, M2AD provides standardized annotations and a reproducible testing framework. Extensive experiments reveal substantial performance degradation of state-of-the-art VAD methods on M2AD, underscoring its value for advancing robust, real-world VAD research.
📝 Abstract
The practical deployment of Visual Anomaly Detection (VAD) systems is hindered by their sensitivity to real-world imaging variations, particularly the complex interplay between viewpoint and illumination, which drastically alters defect visibility. Current benchmarks largely overlook this critical challenge. We introduce Multi-View Multi-Illumination Anomaly Detection (M2AD), a new large-scale benchmark comprising 119,880 high-resolution images designed explicitly to probe VAD robustness under such interacting conditions. By systematically capturing 999 specimens across 10 categories using 12 synchronized views and 10 illumination settings (120 configurations in total), M2AD enables rigorous evaluation. We establish two evaluation protocols: M2AD-Synergy tests the ability to fuse information across diverse configurations, and M2AD-Invariant measures single-image robustness against realistic view–illumination effects. Our extensive benchmarking shows that state-of-the-art VAD methods struggle significantly on M2AD, demonstrating the profound challenge posed by the view–illumination interplay. The benchmark thus serves as an essential tool for developing and validating VAD methods capable of overcoming real-world complexities. Our full dataset and test suite will be released at https://hustcyq.github.io/M2AD to facilitate future research.
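
To make the difference between the two protocols concrete, here is a minimal Python sketch that contrasts them on synthetic anomaly scores. It is purely illustrative: the max-fusion rule, the rank-based AUROC metric, the toy sample sizes, and the replication of object-level labels to every image are all assumptions made for exposition, not the official protocol definitions, which will be fixed by the released test suite.

```python
import numpy as np

# Toy setup (assumed for illustration; M2AD has 999 specimens, not 8).
# Each specimen is imaged under 12 views x 10 illuminations = 120 configurations.
rng = np.random.default_rng(0)
N_OBJECTS, N_VIEWS, N_ILLUMS = 8, 12, 10

# scores[i, v, l]: anomaly score of object i under view v, illumination l.
scores = rng.random((N_OBJECTS, N_VIEWS, N_ILLUMS))
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 = normal, 1 = anomalous object
scores[labels == 1] += 0.3                    # simulate a weakly informative detector

def auroc(y_true, y_score):
    """AUROC via the Mann-Whitney rank statistic (assumes no tied scores)."""
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score), dtype=float)
    ranks[order] = np.arange(1, len(y_score) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Synergy-style evaluation: fuse scores across all 120 configurations of each
# object (max-fusion is one plausible choice), then score at the object level.
fused = scores.reshape(N_OBJECTS, -1).max(axis=1)
print(f"Synergy-style AUROC:   {auroc(labels, fused):.3f}")

# Invariant-style evaluation: score every single image independently, so a
# method is rewarded only if it flags the defect under each configuration.
# (Replicating object labels per image is a simplification; in practice a
# defect may be genuinely invisible from some view-illumination pairs.)
per_image_labels = np.repeat(labels, N_VIEWS * N_ILLUMS)
per_image_scores = scores.reshape(-1)
print(f"Invariant-style AUROC: {auroc(per_image_labels, per_image_scores):.3f}")
```

Under this toy model the Synergy-style number is typically higher, since fusion lets the best configuration compensate for uninformative ones, whereas the Invariant-style score penalizes every configuration in which the detector misses the defect.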