INP-Former++: Advancing Universal Anomaly Detection via Intrinsic Normal Prototypes and Residual Learning

2025-06-04
Citations: 0
Influential: 0
AI Summary
Anomaly detection typically compares test images against normal references drawn from a training set, but variations in appearance and position make those references hard to align with the test image, which limits detection accuracy. This paper instead extracts intrinsic normal prototypes (INPs) directly from a single test image and uses them to guide a decoder that reconstructs only normal regions, with reconstruction errors serving as anomaly scores. Training uses an INP Coherence Loss and a Soft Mining Loss; the extended INP-Former++ adds a soft version of the coherence loss and residual learning. The method is built on a Transformer architecture and combines self-attention, linear composition of prototypes, and INP-guided reconstruction. It reports state-of-the-art results on MVTec-AD, VisA, and Real-IAD, covering single-class, multi-class, semi-supervised, few-shot, and zero-shot settings.

Abstract
Anomaly detection (AD) is essential for industrial inspection and medical diagnosis, yet existing methods typically rely on "comparing" test images to normal references from a training set. However, variations in appearance and positioning often complicate the alignment of these references with the test image, limiting detection accuracy. We observe that most anomalies manifest as local variations, meaning that even within anomalous images, valuable normal information remains. We argue that this information is useful and may be more aligned with the anomalies since both the anomalies and the normal information originate from the same image. Therefore, rather than relying on external normality from the training set, we propose INP-Former, a novel method that extracts Intrinsic Normal Prototypes (INPs) directly from the test image. Specifically, we introduce the INP Extractor, which linearly combines normal tokens to represent INPs. We further propose an INP Coherence Loss to ensure INPs can faithfully represent normality for the test image. These INPs then guide the INP-guided Decoder to reconstruct only normal tokens, with reconstruction errors serving as anomaly scores. Additionally, we propose a Soft Mining Loss to prioritize hard-to-optimize samples during training. INP-Former achieves state-of-the-art performance in single-class, multi-class, and few-shot AD tasks across MVTec-AD, VisA, and Real-IAD, positioning it as a versatile and universal solution for AD. Remarkably, INP-Former also demonstrates some zero-shot AD capability. Furthermore, we propose a soft version of the INP Coherence Loss and enhance INP-Former by incorporating residual learning, leading to the development of INP-Former++. The proposed method significantly improves detection performance across single-class, multi-class, semi-supervised, few-shot, and zero-shot settings.
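The idea of building prototypes as linear combinations of an image's own tokens can be illustrated with a small sketch. This is not the authors' implementation: the softmax weighting, the fixed `queries`, the function names, and the toy 2-D tokens are all assumptions made for illustration, and the nearest-prototype cosine distance is only a simple stand-in for the reconstruction error produced by the learned INP-guided Decoder.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def extract_prototypes(tokens, queries):
    """Build each prototype as a convex (linear) combination of the tokens.

    Assumption: the combination weights come from a softmax over
    query-token dot products; the source only says tokens are
    "linearly combined".
    """
    dim = len(tokens[0])
    prototypes = []
    for query in queries:
        weights = softmax([dot(query, tok) for tok in tokens])
        prototypes.append(
            [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
        )
    return prototypes

def cosine_distance(a, b):
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return 1.0 - dot(a, b) / (norm + 1e-12)

def nearest_prototype_distance(tokens, prototypes):
    """Stand-in anomaly score: distance from each token to its closest prototype."""
    return [min(cosine_distance(tok, p) for p in prototypes) for tok in tokens]

# Toy "image": five similar (normal) tokens and one outlier at index 5.
tokens = [
    [1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.95, 0.0], [1.0, 0.05],
    [0.0, 1.0],
]
queries = [[2.0, 0.0]]  # hypothetical query; in a trained model this would be learned

prototypes = extract_prototypes(tokens, queries)
scores = nearest_prototype_distance(tokens, prototypes)
most_anomalous = scores.index(max(scores))
```

Because the normal tokens dominate the weighted combination, the prototype stays close to them, and the outlier token ends up with the largest distance. This mirrors the abstract's observation that anomalies are mostly local, so the image itself still carries usable normal information.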
Problem

Research questions and friction points this paper is trying to address.

Detects anomalies via intrinsic normal prototypes from test images
Improves alignment of normal references with anomalies in same image
Enhances detection accuracy across diverse anomaly detection tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts Intrinsic Normal Prototypes from test images
Uses INP Coherence Loss for faithful normality representation
Incorporates residual learning to enhance detection performance
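The summary names the losses but does not define them, so the following is only one plausible reading, sketched under stated assumptions. Here a coherence-style loss is taken to be the mean distance from each token to its nearest prototype, a "soft" variant replaces the hard minimum with a softmin-weighted average, and soft mining is taken to mean reweighting per-sample errors by their size relative to the mean. The function names, `temperature`, and `gamma` are hypothetical and do not come from the paper.

```python
import math

def coherence_loss(distances):
    """Mean over tokens of the distance to the nearest prototype.

    distances[i][j] is the distance between token i and prototype j.
    """
    return sum(min(row) for row in distances) / len(distances)

def soft_coherence_loss(distances, temperature=0.1):
    """Assumed soft variant: softmin-weighted average instead of a hard min."""
    total = 0.0
    for row in distances:
        nearest = min(row)
        # Shift by the row minimum so the largest weight is exactly 1.
        weights = [math.exp(-(d - nearest) / temperature) for d in row]
        total += sum(w * d for w, d in zip(weights, row)) / sum(weights)
    return total / len(distances)

def soft_mining_loss(errors, gamma=2.0):
    """Assumed reweighting: samples with above-average error get larger weights."""
    mean_err = sum(errors) / len(errors)
    weights = [(e / (mean_err + 1e-12)) ** gamma for e in errors]
    return sum(w * e for w, e in zip(weights, errors)) / len(errors)

# Toy values: 3 tokens x 2 prototypes, and 3 per-sample errors.
distances = [[0.1, 0.5], [0.4, 0.2], [0.3, 0.3]]
hard = coherence_loss(distances)
soft = soft_coherence_loss(distances, temperature=0.1)

errors = [0.1, 0.1, 0.8]
plain = sum(errors) / len(errors)
mined = soft_mining_loss(errors, gamma=2.0)
```

In this sketch the soft coherence value is never below the hard one and approaches it as `temperature` shrinks, while `gamma=0` reduces the mining loss to a plain mean. Larger `gamma` shifts more of the loss onto the hard sample, which is the behavior the phrase "prioritize hard-to-optimize samples" suggests.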
Authors

Wei Luo
State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China

Haiming Yao
Tsinghua University
Anomaly Detection, Multi-Task Learning, AI for Science, Fine-tuning

Yunkang Cao
Hunan University
Visual Anomaly Detection, Industrial Foundation Model, Embodied Intelligence

Qiyu Chen
Institute of Automation, Chinese Academy of Sciences
Anomaly Detection, Computer Vision, Deep Learning

Ang Gao
State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China

Weiming Shen
Huazhong University of Science and Technology

Weihang Zhang
Assistant Professor, School of Medical Technology, Beijing Institute of Technology
Medical Image Processing

Wenyong Yu
State Key Laboratory of Intelligent Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China