Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection

📅 2025-03-04
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Existing anomaly detection methods typically compare test images against external normal references, making them vulnerable to appearance and positional discrepancies. To address this, the paper proposes INP-Former, a framework built on **Intrinsic Normal Prototype (INP)** modeling: it extracts normal representations directly from the test image itself, removing the need to align external references with the test image. Built on a vision Transformer backbone, the method introduces an INP Extractor, an INP-Guided Decoder, and an INP Coherence Loss, complemented by a Soft Mining Loss that emphasizes hard-to-optimize samples. INP-Former achieves state-of-the-art performance across the MVTec-AD, VisA, and Real-IAD benchmarks, unifies single-class, multi-class, and few-shot anomaly detection under a single paradigm, and also demonstrates some zero-shot capability.

πŸ“ Abstract
Anomaly detection (AD) is essential for industrial inspection, yet existing methods typically rely on ``comparing'' test images to normal references from a training set. However, variations in appearance and positioning often complicate the alignment of these references with the test image, limiting detection accuracy. We observe that most anomalies manifest as local variations, meaning that even within anomalous images, valuable normal information remains. We argue that this information is useful and may be more aligned with the anomalies since both the anomalies and the normal information originate from the same image. Therefore, rather than relying on external normality from the training set, we propose INP-Former, a novel method that extracts Intrinsic Normal Prototypes (INPs) directly from the test image. Specifically, we introduce the INP Extractor, which linearly combines normal tokens to represent INPs. We further propose an INP Coherence Loss to ensure INPs can faithfully represent normality for the testing image. These INPs then guide the INP-Guided Decoder to reconstruct only normal tokens, with reconstruction errors serving as anomaly scores. Additionally, we propose a Soft Mining Loss to prioritize hard-to-optimize samples during training. INP-Former achieves state-of-the-art performance in single-class, multi-class, and few-shot AD tasks across MVTec-AD, VisA, and Real-IAD, positioning it as a versatile and universal solution for AD. Remarkably, INP-Former also demonstrates some zero-shot AD capability. Code is available at:https://github.com/luow23/INP-Former.
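The core idea in the abstract can be sketched numerically: prototypes are formed as linear (attention-weighted) combinations of the image's own patch tokens, a coherence term keeps every token close to some prototype, and per-token distance to the nearest prototype stands in for the decoder's reconstruction error. This is a minimal NumPy sketch under assumed simplifications, not the authors' implementation — in INP-Former the queries are learned and the scoring comes from an INP-Guided Transformer decoder; the random queries and nearest-prototype distance here are illustrative stand-ins.

```python
import numpy as np

def extract_inps(tokens, num_inps=4, seed=0):
    # INP Extractor (sketch): each prototype is a softmax-weighted
    # linear combination of the test image's own tokens. The queries
    # are random here; in the paper they are learned parameters.
    rng = np.random.default_rng(seed)
    queries = rng.standard_normal((num_inps, tokens.shape[1]))
    attn = queries @ tokens.T                        # (num_inps, n_tokens)
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)          # softmax over tokens
    return attn @ tokens                             # (num_inps, dim)

def coherence_loss(tokens, inps):
    # INP Coherence Loss (sketch): every token should lie close to its
    # nearest prototype, so the INPs faithfully cover normality.
    d2 = ((tokens[:, None, :] - inps[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1).mean()

def anomaly_scores(tokens, inps):
    # Scoring (sketch): distance to the nearest INP is used as a
    # stand-in for the INP-Guided Decoder's reconstruction error.
    d2 = ((tokens[:, None, :] - inps[None, :, :]) ** 2).sum(-1)
    return np.sqrt(d2.min(axis=1))                   # one score per token

# Example: 196 ViT patch tokens of dimension 64 from a single test image.
tokens = np.random.default_rng(1).standard_normal((196, 64))
inps = extract_inps(tokens)
scores = anomaly_scores(tokens, inps)
```

Tokens that no combination of the image's own normal content can explain receive large scores, which is how local anomalies surface even though the prototypes come from the same (possibly anomalous) image.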
Problem

Research questions and friction points this paper is trying to address.

Existing methods depend on external normal references drawn from a training set for comparison.
Appearance and positional variations make aligning those references with the test image difficult, limiting accuracy.
Normal information present within anomalous test images themselves goes unexploited.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts Intrinsic Normal Prototypes from test images
Uses INP Coherence Loss for accurate normality representation
Implements Soft Mining Loss to prioritize challenging samples
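The last bullet's idea can be illustrated as loss reweighting: samples that are currently hard to optimize (large per-sample loss) receive larger weight in the training objective. This is a hedged sketch — the softmax-with-temperature weighting below is an assumption for illustration, not the paper's exact Soft Mining Loss formulation.

```python
import numpy as np

def soft_mining_weights(per_sample_loss, temperature=0.5):
    # Hypothetical weighting scheme: a softmax over per-sample losses,
    # so harder samples dominate the weighted objective. Lower
    # temperature sharpens the focus on the hardest samples.
    z = np.asarray(per_sample_loss, dtype=float) / temperature
    z = np.exp(z - z.max())          # numerically stable softmax
    return z / z.sum()

# Example: the third sample is much harder than the rest.
losses = np.array([0.1, 0.2, 1.5, 0.05])
weights = soft_mining_weights(losses)
weighted_loss = float(weights @ losses)
```

With this weighting, the hard sample contributes most of the gradient, so `weighted_loss` exceeds the plain mean of the per-sample losses.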