🤖 AI Summary
In unsupervised image anomaly detection, existing reconstruction-based methods tend to reconstruct anomalous regions too faithfully because of their strong fitting capacity, which leads to high false-negative rates. To address this, we propose the Attention-Guided Perturbation Network (AGPNet): (i) for the first time, foreground-aware attention masks are embedded into the perturbation generation process, enabling semantic-aware, adaptive noise injection into critical regions; and (ii) a dual-branch collaborative architecture jointly optimizes the reconstruction and perturbation objectives, making the model more sensitive to anomalies. By integrating multi-scale features with spatially adaptive perturbations, AGPNet achieves state-of-the-art detection performance in few-shot, one-class, and multi-class settings on the MVTec-AD, VisA, and MVTec-3D benchmarks.
📝 Abstract
Reconstruction-based methods have significantly advanced unsupervised image anomaly detection, in which only normal images are available for training. However, it has been shown that modern neural networks generally have a strong reconstruction capacity and often reconstruct both normal and abnormal samples well, so anomalous regions cannot be spotted by checking reconstruction quality. To prevent anomalies from being well reconstructed, one simple but effective strategy is to perturb normal samples and then map the perturbed versions back to the normal ones. Yet this strategy treats every spatial position equally, disregarding the fact that foreground locations are inherently more important for reconstruction. Motivated by this, we present a simple yet effective reconstruction framework named Attention-Guided Perturbation Network (AGPNet), which learns to add perturbations guided by an attention mask during training. Specifically, it consists of two branches, i.e., a reconstruction branch and an auxiliary attention-based perturbation branch. The reconstruction branch learns to reconstruct normal samples, while the auxiliary branch produces attention masks that guide the noise perturbation process for normal samples. In this way, we expect to synthesize hard yet more informative anomalies for training, which enable the reconstruction branch to learn important inherent normal patterns both comprehensively and efficiently. Extensive experiments on several popular benchmarks, including MVTec-AD, VisA, and MVTec-3D, show that AGPNet obtains leading anomaly detection results under few-shot, one-class, and multi-class setups.
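
The core idea of attention-guided perturbation can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the noise distribution, how the mask is produced, or how the two branches are trained, so the function name `attention_guided_perturb`, the Gaussian noise, the `noise_std` parameter, and the min-max normalization of the mask are all assumptions made for illustration. The sketch only shows how a per-pixel attention mask can concentrate noise on foreground locations instead of perturbing every position equally.

```python
import numpy as np

def attention_guided_perturb(image, attention, noise_std=0.5, rng=None):
    """Add noise to `image` (H, W, C), scaled per pixel by `attention` (H, W).

    Hypothetical helper: Gaussian noise and min-max normalization are
    illustrative assumptions, not details taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Rescale the attention map to [0, 1] so it acts as a soft mask.
    a_min, a_max = attention.min(), attention.max()
    mask = (attention - a_min) / (a_max - a_min + 1e-8)
    noise = rng.normal(0.0, noise_std, size=image.shape)
    # Broadcast the (H, W) mask over the channel axis: high-attention
    # (foreground) pixels receive strong noise, low-attention ones little.
    perturbed = image + mask[..., None] * noise
    return perturbed, mask

# Toy example: a random "image" whose central 4x4 patch is the foreground.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
attention = np.zeros((8, 8))
attention[2:6, 2:6] = 1.0
perturbed, mask = attention_guided_perturb(image, attention, rng=rng)
```

In a training loop following the abstract's description, a reconstruction network would receive `perturbed` as input and be trained to recover the clean `image`, while a second branch would supply `attention` instead of the hand-made mask used here.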