Normality Prior Guided Multi-Semantic Fusion Network for Unsupervised Image Anomaly Detection

📅 2025-06-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of detecting logical anomalies—locally normal yet globally semantically anomalous regions—in unsupervised image anomaly detection, this paper proposes a normality-prior-guided multi-semantic fusion network. The method leverages a pre-trained vision-language model to extract abstract, global semantic representations from normal samples and constructs a learnable vector-quantized semantic codebook to enable hierarchical semantic fusion and guidance. Crucially, the semantic prior replaces raw compressed features at the encoder-decoder bottleneck, explicitly constraining reconstruction toward the normal data distribution and thereby suppressing misleading reconstructions of logical anomalies. Evaluated on the MVTec LOCO AD benchmark, the approach achieves state-of-the-art performance: +5.7% in pixel-level sPRO and +2.6% in image-level AUROC. The core contributions lie in (i) the integration of learnable semantic priors via vector quantization, and (ii) the explicit semantic regularization of reconstruction to enhance logical anomaly discrimination.
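For readers less familiar with the image-level AUROC figure quoted above, the following is a minimal sketch of how that metric can be computed from per-image anomaly scores. It uses the standard pairwise-ranking definition of AUROC. The function name `auroc` and the toy scores are illustrative assumptions and are not taken from the paper or its code.

```python
import numpy as np

def auroc(scores, labels):
    """Area under the ROC curve via pairwise ranking (hypothetical helper).

    scores: per-image anomaly scores, higher means more anomalous.
    labels: 1 for anomalous images, 0 for normal images.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]    # scores of anomalous images
    neg = scores[~labels]   # scores of normal images
    # Fraction of (anomalous, normal) pairs ranked correctly; ties count half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy example (made-up scores): two normal and two anomalous images.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
result = auroc(scores, labels)
print(result)  # 0.75
```

A value of 1.0 would mean every anomalous image scores higher than every normal one, and 0.5 corresponds to chance-level ranking.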

📝 Abstract
Detecting logical anomalies has recently emerged as a more challenging task than detecting structural ones. Existing encoder-decoder-based methods typically compress inputs into low-dimensional bottlenecks on the assumption that the compression process can effectively suppress the transmission of logical anomalies to the decoder. However, logical anomalies present a particular difficulty because, while their local features often resemble normal semantics, their global semantics deviate significantly from normal patterns. Owing to the generalisation capabilities inherent in neural networks, these abnormal semantic features can propagate through low-dimensional bottlenecks, which ultimately allows the decoder to reconstruct anomalous images with misleading fidelity. To tackle this challenge, we propose a novel normality prior guided multi-semantic fusion network for unsupervised anomaly detection. Instead of feeding the compressed bottleneck features to the decoder directly, we introduce the multi-semantic features of normal samples into the reconstruction process. To this end, we first extract abstract global semantics of normal samples with a pre-trained vision-language network, then construct learnable semantic codebooks that store representative feature vectors of normal samples via vector quantisation. Finally, these multi-semantic features are fused and used as input to the decoder, guiding the reconstruction of anomalies towards normality. Extensive experiments validate the effectiveness of the proposed method, which achieves state-of-the-art performance on the MVTec LOCO AD dataset, with improvements of 5.7% in pixel-sPRO and 2.6% in image-AUROC. The source code is available at https://github.com/Xmh-L/NPGMF.
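The vector quantisation step mentioned in the abstract can be illustrated with a minimal sketch of the basic codebook lookup: each feature vector is replaced by its nearest codebook entry, so the decoder only ever sees vectors drawn from the stored set. This is a generic nearest-neighbour lookup under stated assumptions, not the authors' implementation. The function name `quantize`, the Euclidean distance, and the toy two-dimensional codebook are all hypothetical, and the sketch omits how the codebook is learned.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry (hypothetical helper).

    features: array of shape (N, D), one feature vector per row.
    codebook: array of shape (K, D), one stored prototype per row.
    Returns the quantised features (N, D) and the chosen indices (N,).
    """
    # Squared Euclidean distance between every feature and every entry: (N, K).
    diff = features[:, None, :] - codebook[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    idx = dist2.argmin(axis=1)
    return codebook[idx], idx

# Toy codebook with four 2-D prototypes (illustrative values only).
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
features = np.array([[0.9, 0.1], [0.2, 0.8]])

quantized, idx = quantize(features, codebook)
print(idx)        # [1 2]
print(quantized)  # rows 1 and 2 of the codebook
```

Because the output is always a codebook row, a feature that lies far from every stored prototype is still snapped to one of them, which is the intuition behind steering the reconstruction towards stored normal patterns.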
Problem

Research questions and friction points this paper is trying to address.

Detecting logical anomalies in images is challenging
Existing methods fail to suppress abnormal semantic propagation
Proposing a multi-semantic fusion network for anomaly detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Normality prior guided multi-semantic fusion network
Pre-trained vision-language network extracts global semantics
Learnable semantic codebooks store normal feature vectors
Muhao Xu
PhD, Shandong University
Xueying Zhou
Shandong Key Laboratory of Ubiquitous Intelligent Computing, University of Jinan, Jinan 250022, China
Xizhan Gao
Shandong Key Laboratory of Ubiquitous Intelligent Computing, University of Jinan, Jinan 250022, China
Weiye Song
Postdoctoral Fellow, Harvard Medical School, Massachusetts General Hospital Wellman Center
Guang Feng
University of Jinan
deep learning, referring image segmentation, saliency detection
Sijie Niu
University of Jinan
Medical Image Computing, Pattern Recognition