🤖 AI Summary
To address the scarcity of training samples and the difficulty of covering diverse test distributions in few-shot multimodal industrial anomaly detection, this paper proposes a hypergraph-enhanced memory mechanism. Methodologically, it introduces high-order relational modeling to this task for the first time: a hypergraph captures structural commonalities among the limited normal samples to build a memory bank of intra-class structural priors; a training-free hypergraph message-passing mechanism, coupled with hyperedge-guided memory retrieval, enables semantic-aware feature updating and efficient matching; and multimodal feature fusion improves representation robustness. Evaluated on MVTec 3D-AD and Eyecandies, the method significantly outperforms existing state-of-the-art approaches, reducing the false positive rate by 12.6% while substantially improving detection accuracy and cross-scenario generalization.
📝 Abstract
Few-shot multimodal industrial anomaly detection is a critical yet underexplored task that offers the ability to adapt quickly to complex industrial scenarios. In few-shot settings, the limited training samples often fail to cover the diverse patterns present in test samples. This challenge can be mitigated by extracting structural commonality from the small number of training samples. In this paper, we propose CIF (Commonality In Few), a novel few-shot unsupervised multimodal industrial anomaly detection method based on structural commonality. To extract intra-class structural information, we employ hypergraphs, which can model higher-order correlations, to capture the structural commonality within training samples, and we use a memory bank to store this intra-class structural prior. First, we design a semantic-aware hypergraph construction module tailored for single-semantic industrial images, from which we extract common structures to guide the construction of the memory bank. Second, we use a training-free hypergraph message-passing module to update the visual features of test samples, reducing the distribution gap between test features and the features in the memory bank. Third, we propose a hyperedge-guided memory search module, which uses structural information to assist the memory search process and reduce the false positive rate. Experimental results on the MVTec 3D-AD and Eyecandies datasets show that our method outperforms state-of-the-art (SOTA) methods in few-shot settings. Code is available at https://github.com/Sunny5250/CIF.
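To make the pipeline described in the abstract more concrete, the following is a minimal illustrative sketch and not the authors' implementation (see the linked repository for that). It assumes patch features are rows of a NumPy array. The hypothetical `knn_hypergraph` uses plain k-nearest-neighbour grouping as a stand-in for the semantic-aware hypergraph construction, `hypergraph_message_passing` shows one common training-free node-to-hyperedge-to-node averaging step, and `anomaly_scores` uses an ordinary nearest-neighbour memory lookup rather than the hyperedge-guided search. All function names, the `k` and `alpha` parameters, and the random data are assumptions made for illustration.

```python
import numpy as np


def knn_hypergraph(features, k=3):
    """Return an incidence matrix H of shape (nodes, hyperedges).

    Hyperedge e groups node e with its k nearest neighbours.
    Assumption: kNN grouping stands in for the paper's semantic-aware
    construction, which is not specified in the abstract. Requires k < n.
    """
    n = features.shape[0]
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each node from its own neighbour list
    neighbours = np.argsort(dist, axis=1)[:, :k]
    H = np.zeros((n, n))
    for e in range(n):
        H[e, e] = 1.0              # the node that defines the hyperedge
        H[neighbours[e], e] = 1.0  # its k nearest neighbours
    return H


def hypergraph_message_passing(features, H, alpha=0.5):
    """One training-free step: average nodes into hyperedges, then back to nodes.

    alpha (an assumed mixing weight) blends the aggregated message with
    the original feature; alpha=0 leaves the features unchanged.
    """
    edge_degree = H.sum(axis=0)  # number of nodes in each hyperedge
    node_degree = H.sum(axis=1)  # number of hyperedges containing each node
    edge_features = (H.T @ features) / edge_degree[:, None]
    node_messages = (H @ edge_features) / node_degree[:, None]
    return (1.0 - alpha) * features + alpha * node_messages


def anomaly_scores(test_features, memory_bank):
    """Score each test feature by its distance to the nearest memory entry."""
    diff = test_features[:, None, :] - memory_bank[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist.min(axis=1)


# Usage with random stand-in data (not real image or point-cloud features).
rng = np.random.default_rng(0)
memory_bank = rng.normal(size=(20, 8))   # stored features of normal samples
test_patches = rng.normal(size=(10, 8))  # features of one test sample
H = knn_hypergraph(test_patches, k=3)
smoothed = hypergraph_message_passing(test_patches, H, alpha=0.5)
scores = anomaly_scores(smoothed, memory_bank)  # one score per test patch
```

In this sketch, higher scores mean a test patch lies farther from every stored normal feature. The smoothing step pulls each patch toward the average of the hyperedges it belongs to, which is one simple way structural grouping can influence the features before the memory lookup.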