Mentor3AD: Feature Reconstruction-based 3D Anomaly Detection via Multi-modality Mentor Learning

📅 2025-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses unsupervised 3D anomaly detection, tackling two key challenges: the weak discriminability of single-modality reconstruction and the difficulty of aligning features across modalities. The authors propose Mentor3AD, a multi-modal mentor-learning framework. Its core innovation is the "mentor feature": a highly discriminative representation obtained by fusing intermediate-layer features from the RGB and 3D modalities. A Mentor of Fusion Module (MFM) builds this mentor feature, a Mentor of Guidance Module (MGM) uses it to guide cross-modal self-supervised reconstruction, and a Voting Module (VM) aggregates the anomaly scores from the individual branches. Evaluated on MVTec 3D-AD and Eyecandies, Mentor3AD achieves state-of-the-art performance, outperforming existing single- and dual-modality reconstruction methods, particularly in fine-grained anomaly localization.

📝 Abstract
Multimodal feature reconstruction is a promising approach to 3D anomaly detection, as it leverages the complementary information of the two modalities. However, single-modality reconstruction offers limited discriminability, and aligning features across modalities is difficult. To address these challenges, we propose a novel method called Mentor3AD, which utilizes multi-modal mentor learning: intermediate features from both modalities are fused so that normal and abnormal features become easier to distinguish. By leveraging the features shared across modalities, Mentor3AD can extract more effective representations and guide feature reconstruction, ultimately improving detection performance. Specifically, Mentor3AD includes a Mentor of Fusion Module (MFM) that merges features extracted from the RGB and 3D modalities to create a mentor feature. We further design a Mentor of Guidance Module (MGM) that facilitates cross-modal reconstruction with the support of the mentor feature. Lastly, we introduce a Voting Module (VM) to generate the final anomaly score more accurately. Extensive comparative and ablation studies on MVTec 3D-AD and Eyecandies verify the effectiveness of the proposed method.
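The pipeline the abstract describes (fuse RGB and 3D intermediate features into a mentor feature, score each branch by reconstruction error, then vote) can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the fusion (a mean of normalized features), the cosine-distance scoring, and the averaging vote are placeholder choices; the actual MFM/MGM/VM are learned modules.

```python
import numpy as np

def mentor_fusion(f_rgb, f_3d):
    """Toy stand-in for the Mentor of Fusion Module (MFM): fuse RGB and 3D
    feature maps (H, W, C) into a single 'mentor' feature by averaging the
    L2-normalized modality features. The real MFM is a learned fusion."""
    f_rgb = f_rgb / (np.linalg.norm(f_rgb, axis=-1, keepdims=True) + 1e-8)
    f_3d = f_3d / (np.linalg.norm(f_3d, axis=-1, keepdims=True) + 1e-8)
    return 0.5 * (f_rgb + f_3d)

def reconstruction_score(feat, recon):
    """Per-location anomaly score as the cosine distance between an original
    feature map and its reconstruction (both (H, W, C)); returns (H, W)."""
    num = (feat * recon).sum(axis=-1)
    den = np.linalg.norm(feat, axis=-1) * np.linalg.norm(recon, axis=-1) + 1e-8
    return 1.0 - num / den

def vote(score_maps):
    """Toy stand-in for the Voting Module (VM): aggregate the per-branch
    anomaly maps (e.g. RGB, 3D, mentor branches) by simple averaging."""
    return np.mean(np.stack(score_maps, axis=0), axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f_rgb = rng.normal(size=(32, 32, 64))   # hypothetical intermediate RGB features
    f_3d = rng.normal(size=(32, 32, 64))    # hypothetical intermediate 3D features
    mentor = mentor_fusion(f_rgb, f_3d)
    # Pretend each branch reconstructs toward the mentor feature.
    scores = [reconstruction_score(f, mentor) for f in (f_rgb, f_3d)]
    final_map = vote(scores)                # (32, 32) anomaly map
    print(final_map.shape, float(final_map.max()))
```

The voted map can then be thresholded for localization, or reduced (e.g. by its maximum) to an image-level anomaly score.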
Problem

Research questions and friction points this paper is trying to address.

Improving 3D anomaly detection via multi-modality feature fusion
Enhancing feature reconstruction using mentor-guided cross-modal learning
Accurately generating anomaly scores through integrated voting mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal mentor learning for feature fusion
RGB and 3D feature merging via MFM
Cross-modal reconstruction guided by MGM