Commonality in Few: Few-Shot Multimodal Anomaly Detection via Hypergraph-Enhanced Memory

📅 2025-11-08
🤖 AI Summary
To address the scarcity of training samples and the challenge of covering diverse test distributions in few-shot multimodal industrial anomaly detection, this paper proposes a hypergraph-enhanced memory mechanism. Methodologically, it introduces high-order relational modeling to this task for the first time: a hypergraph captures structural commonalities among limited normal samples to construct an intra-class structural prior memory bank; a training-free hypergraph message-passing mechanism coupled with hyperedge-guided memory retrieval enables semantic-aware memory updating and efficient matching; and multimodal feature fusion enhances representation robustness. Evaluated on MVTec 3D-AD and Eyecandies, the method significantly outperforms existing state-of-the-art approaches—reducing false positive rate by 12.6% while substantially improving detection accuracy and cross-scenario generalization capability.

📝 Abstract
Few-shot multimodal industrial anomaly detection is a critical yet underexplored task, offering the ability to quickly adapt to complex industrial scenarios. In few-shot settings, insufficient training samples often fail to cover the diverse patterns present in test samples. This challenge can be mitigated by extracting structural commonality from a small number of training samples. In this paper, we propose a novel few-shot unsupervised multimodal industrial anomaly detection method based on structural commonality, CIF (Commonality In Few). To extract intra-class structural information, we employ hypergraphs, which are capable of modeling higher-order correlations, to capture the structural commonality within training samples, and use a memory bank to store this intra-class structural prior. Firstly, we design a semantic-aware hypergraph construction module tailored for single-semantic industrial images, from which we extract common structures to guide the construction of the memory bank. Secondly, we use a training-free hypergraph message passing module to update the visual features of test samples, reducing the distribution gap between test features and features in the memory bank. We further propose a hyperedge-guided memory search module, which utilizes structural information to assist the memory search process and reduce the false positive rate. Experimental results on the MVTec 3D-AD dataset and the Eyecandies dataset show that our method outperforms the state-of-the-art (SOTA) methods in few-shot settings. Code is available at https://github.com/Sunny5250/CIF.
Problem

Research questions and friction points this paper is trying to address.

Detects industrial anomalies using few-shot multimodal data
Extracts structural commonality from limited training samples
Reduces false positives through hypergraph-enhanced memory search
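
As a point of reference for the memory search mentioned above, the following is a minimal sketch of plain nearest-neighbour scoring against a memory bank of normal features, as commonly used in memory-based anomaly detection. It is not the paper's hyperedge-guided search: the structural guidance is not reproduced here, and the function names `anomaly_scores` and `image_score` are hypothetical.

```python
import math

def anomaly_scores(test_patches, memory_bank):
    """Score each test patch by its distance to the closest stored normal feature.

    Assumption: plain Euclidean nearest-neighbour lookup. The paper's
    hyperedge-guided search is NOT implemented here.
    """
    return [min(math.dist(p, m) for m in memory_bank) for p in test_patches]

def image_score(test_patches, memory_bank):
    """Image-level score: the largest patch-level score (a common convention)."""
    return max(anomaly_scores(test_patches, memory_bank))

# Toy data: two stored normal features and two test patches.
memory_bank = [[0.0, 0.0], [1.0, 0.0]]
test_patches = [[0.0, 0.0], [4.0, 4.0]]
scores = anomaly_scores(test_patches, memory_bank)
```

In this toy case the first patch matches a stored feature exactly, while the second lies far from every entry and so receives a high score. A patch that is normal but poorly covered by a few-shot memory bank would also score high, which is the false-positive problem the structural guidance is meant to reduce.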
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypergraphs model higher-order correlations for structural commonality
Memory bank stores intra-class structural prior information
Training-free hypergraph message passing updates test features
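
To make the idea of training-free hypergraph message passing concrete, here is a minimal sketch under stated assumptions: hyperedges are built by k-nearest-neighbour grouping, and features are propagated by simple mean aggregation (node to hyperedge, then hyperedge to node) with no learned weights. The paper's semantic-aware construction and its exact update rule are not reproduced; `knn_hyperedges`, `message_pass`, and the mixing weight `alpha` are hypothetical.

```python
import math

def knn_hyperedges(features, k):
    """Build one hyperedge per node: the node itself plus its k nearest neighbours.

    Assumption: Euclidean k-NN grouping, a generic hypergraph construction.
    """
    edges = []
    for i, f in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: math.dist(f, features[j]))
        edges.append([i] + others[:k])
    return edges

def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def message_pass(features, edges, alpha=0.5):
    """One parameter-free round of node -> hyperedge -> node aggregation."""
    # Step 1: each hyperedge takes the mean feature of its member nodes.
    edge_feats = [mean([features[j] for j in e]) for e in edges]
    updated = []
    for i, f in enumerate(features):
        # Step 2: each node averages the hyperedges it belongs to
        # (never empty, since every node is in its own hyperedge).
        incident = [edge_feats[e] for e, members in enumerate(edges) if i in members]
        agg = mean(incident)
        # Step 3: blend the original feature with the aggregated one.
        updated.append([(1 - alpha) * a + alpha * b for a, b in zip(f, agg)])
    return updated

features = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
edges = knn_hyperedges(features, k=1)
updated = message_pass(features, edges)
```

Each node moves toward the mean of the groups it belongs to, so features within a group are pulled closer together without any training step.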
Authors

Yuxuan Lin
College of Computer Science and Artificial Intelligence, Fudan University
Computer Vision, Multimodal Learning, Embodied AI
Hanjing Yan
School of Information Science and Engineering, East China University of Science and Technology
Xuan Tong
College of Intelligent Robotics and Advanced Manufacturing, Fudan University
Yang Chang
College of Intelligent Robotics and Advanced Manufacturing, Fudan University
Huanzhen Wang
College of Intelligent Robotics and Advanced Manufacturing, Fudan University
Ziheng Zhou
College of Intelligent Robotics and Advanced Manufacturing, Fudan University
Shuyong Gao
Fudan University
Human Visual Attention, Generative Model, Weakly Supervised Learning
Yan Wang
School of Data Science and Engineering, East China Normal University
Wenqiang Zhang
Shanghai Key Lab of Intelligent Information Processing, College of Computer Science and Artificial Intelligence, Fudan University; College of Intelligent Robotics and Advanced Manufacturing, Fudan University