Mentor3AD: Feature Reconstruction-based 3D Anomaly Detection via Multi-modality Mentor Learning

📅 2025-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses unsupervised 3D anomaly detection, tackling two key challenges: the weak discriminability of single-modality reconstruction and the difficulty of aligning features across modalities. The authors propose Mentor3AD, a multi-modal mentor-learning framework. Its core innovation is the "mentor feature": a highly discriminative representation obtained by fusing intermediate-layer features from the RGB and 3D modalities. A Mentor of Fusion Module (MFM) builds this mentor feature, a Mentor of Guidance Module (MGM) uses it to guide cross-modal self-supervised reconstruction, and a Voting Module (VM) aggregates the anomaly scores from the individual branches. Evaluated on MVTec 3D-AD and Eyecandies, Mentor3AD achieves state-of-the-art performance, outperforming existing single- and dual-modality reconstruction methods, particularly in fine-grained anomaly localization.

📝 Abstract
Multimodal feature reconstruction is a promising approach to 3D anomaly detection, as it leverages the complementary information of the two modalities. However, single-modality reconstruction offers limited discriminability, and aligning features across modalities is difficult. To address these challenges, we propose a novel method called Mentor3AD, which utilizes multi-modal mentor learning: intermediate features from both modalities are fused so that normal and abnormal features become easier to distinguish. By leveraging the features shared across modalities, Mentor3AD can extract more effective representations and guide feature reconstruction, ultimately improving detection performance. Specifically, Mentor3AD includes a Mentor of Fusion Module (MFM) that merges features extracted from the RGB and 3D modalities to create a mentor feature. We further design a Mentor of Guidance Module (MGM) that facilitates cross-modal reconstruction with the support of the mentor feature. Lastly, we introduce a Voting Module (VM) to generate the final anomaly score more accurately. Extensive comparative and ablation studies on MVTec 3D-AD and Eyecandies verify the effectiveness of the proposed method.
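The pipeline the abstract describes (fuse RGB and 3D intermediate features into a mentor feature, score each branch by reconstruction error, then vote) can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the fusion (a mean of normalized features), the cosine-distance scoring, and the averaging vote are placeholder choices; the actual MFM/MGM/VM are learned modules.

```python
import numpy as np

def mentor_fusion(f_rgb, f_3d):
    """Toy stand-in for the Mentor of Fusion Module (MFM): fuse RGB and 3D
    feature maps (H, W, C) into a single 'mentor' feature by averaging the
    L2-normalized modality features. The real MFM is a learned fusion."""
    f_rgb = f_rgb / (np.linalg.norm(f_rgb, axis=-1, keepdims=True) + 1e-8)
    f_3d = f_3d / (np.linalg.norm(f_3d, axis=-1, keepdims=True) + 1e-8)
    return 0.5 * (f_rgb + f_3d)

def reconstruction_score(feat, recon):
    """Per-location anomaly score as the cosine distance between an original
    feature map and its reconstruction (both (H, W, C)); returns (H, W)."""
    num = (feat * recon).sum(axis=-1)
    den = np.linalg.norm(feat, axis=-1) * np.linalg.norm(recon, axis=-1) + 1e-8
    return 1.0 - num / den

def vote(score_maps):
    """Toy stand-in for the Voting Module (VM): aggregate the per-branch
    anomaly maps (e.g. RGB, 3D, mentor branches) by simple averaging."""
    return np.mean(np.stack(score_maps, axis=0), axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f_rgb = rng.normal(size=(32, 32, 64))   # hypothetical intermediate RGB features
    f_3d = rng.normal(size=(32, 32, 64))    # hypothetical intermediate 3D features
    mentor = mentor_fusion(f_rgb, f_3d)
    # Pretend each branch reconstructs toward the mentor feature.
    scores = [reconstruction_score(f, mentor) for f in (f_rgb, f_3d)]
    final_map = vote(scores)                # (32, 32) anomaly map
    print(final_map.shape, float(final_map.max()))
```

The voted map can then be thresholded for localization, or reduced (e.g. by its maximum) to an image-level anomaly score.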
Problem

Research questions and friction points this paper is trying to address.

Improving 3D anomaly detection via multi-modality feature fusion
Enhancing feature reconstruction using mentor-guided cross-modal learning
Accurately generating anomaly scores through integrated voting mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal mentor learning for feature fusion
RGB and 3D feature merging via MFM
Cross-modal reconstruction guided by MGM