🤖 AI Summary
This work addresses the challenge of industrial visual anomaly detection with only a few normal samples by proposing the first training-free, prompt-tuning-free, and external memory-free few-shot method. Leveraging frozen DINOv2 features extracted from image patches, the approach constructs a low-dimensional subspace of normal patterns via principal component analysis (PCA) and uses the reconstruction residual within this subspace as the anomaly criterion. The method is simple, interpretable, and relies entirely on pretrained foundation models without any additional training. It achieves state-of-the-art performance in few-shot anomaly detection; in the one-shot setting, it reports image- and pixel-level AUROC scores of 98.0%/97.6% on MVTec-AD and 93.3%/98.3% on VisA, respectively.
📝 Abstract
Detecting visual anomalies in industrial inspection often requires training with only a few normal images per category. Recent few-shot methods achieve strong results by employing foundation-model features, but they typically rely on memory banks, auxiliary datasets, or multi-modal tuning of vision-language models. We therefore question whether such complexity is necessary given the feature representations of vision foundation models. To answer this question, we introduce SubspaceAD, a training-free method that operates in two simple stages. First, patch-level features are extracted from a small set of normal images by a frozen DINOv2 backbone. Second, a Principal Component Analysis (PCA) model is fit to these features to estimate the low-dimensional subspace of normal variations. At inference, anomalies are detected via the reconstruction residual with respect to this subspace, producing interpretable and statistically grounded anomaly scores. Despite its simplicity, SubspaceAD achieves state-of-the-art performance across one-shot and few-shot settings without training, prompt tuning, or memory banks. In the one-shot anomaly detection setting, SubspaceAD achieves image-level and pixel-level AUROC of 98.0% and 97.6% on the MVTec-AD dataset, and 93.3% and 98.3% on the VisA dataset, respectively, surpassing prior state-of-the-art results. Code and demo are available at https://github.com/CLendering/SubspaceAD.
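The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frozen DINOv2 feature extractor is replaced by synthetic patch features, and the function names (`fit_normal_subspace`, `anomaly_scores`) are hypothetical. Only the core idea is reproduced: fit PCA to normal patch features, then score test patches by their reconstruction residual with respect to the estimated subspace.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_normal_subspace(normal_feats, n_components):
    """Stage 2: estimate the low-dimensional subspace of normal
    variations by fitting PCA to patch features from normal images."""
    pca = PCA(n_components=n_components)
    pca.fit(normal_feats)
    return pca

def anomaly_scores(pca, feats):
    """Inference: project features onto the normal subspace and score
    each patch by the norm of its reconstruction residual."""
    recon = pca.inverse_transform(pca.transform(feats))
    return np.linalg.norm(feats - recon, axis=1)

rng = np.random.default_rng(0)

# Synthetic stand-in for frozen DINOv2 patch features: "normal" patches
# lie near a 5-dimensional subspace of a 64-dimensional feature space.
basis = rng.normal(size=(5, 64))
normal_feats = rng.normal(size=(500, 5)) @ basis \
    + 0.01 * rng.normal(size=(500, 64))

pca = fit_normal_subspace(normal_feats, n_components=5)

# Test patches on the subspace score low; off-subspace patches score high.
test_normal = rng.normal(size=(10, 5)) @ basis
test_anomalous = rng.normal(size=(10, 64))
print("normal residuals:   ", anomaly_scores(pca, test_normal).mean())
print("anomalous residuals:", anomaly_scores(pca, test_anomalous).mean())
```

In the actual method, `feats` would be per-patch DINOv2 embeddings, so the residuals form a spatial anomaly map that can be thresholded for pixel-level localization.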