🤖 AI Summary
To address the lack of strong pre-trained foundation models for 3D point cloud surface anomaly detection in industrial manufacturing, this paper proposes an importance-aware cross-modal ensemble network. Methodologically, it integrates a large 2D vision model (e.g., ViT) with a lightweight 3D expert model and adds a learnable importance-aware fusion module that dynamically weights the multi-modal anomaly scores. A customized loss function is further introduced so that a poorly performing modality does not degrade the fused result. The key contribution lies in achieving high-precision cross-modal collaboration without requiring any 3D pre-training, leveraging only 2D prior knowledge. Evaluated on the MVTec 3D-AD benchmark, the method achieves state-of-the-art performance with significantly reduced false positive rates, demonstrating its reliability and practical utility in real-world industrial settings.
📝 Abstract
Surface anomaly detection is pivotal for ensuring product quality in industrial manufacturing. While 2D image-based methods have achieved remarkable success, 3D point cloud-based detection remains underexplored despite its richer geometric cues. We argue that the key bottleneck is the absence of powerful pretrained foundation backbones in 3D comparable to those in 2D. To bridge this gap, we propose the Importance-Aware Ensemble Network (IAENet), an ensemble framework that synergizes a 2D pretrained expert with 3D expert models. However, fusing predictions from disparate sources is non-trivial: existing strategies can be affected by a poorly performing modality and thus degrade overall accuracy. To address this challenge, we introduce a novel Importance-Aware Fusion (IAF) module that dynamically assesses the contribution of each source and reweights their anomaly scores. Furthermore, we devise critical loss functions that explicitly guide the optimization of IAF, enabling it not only to combine the collective knowledge of the source experts but also to preserve their unique strengths, thereby enhancing the overall performance of anomaly detection. Extensive experiments on MVTec 3D-AD demonstrate that our IAENet achieves a new state-of-the-art with a markedly lower false positive rate, underscoring its practical value for industrial deployment.
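To make the idea of reweighting per-source anomaly scores concrete, the following is a minimal sketch and not the paper's IAF module. It assumes each expert outputs one anomaly score in [0, 1] and that importance is expressed as one logit per source, normalized with a softmax; the abstract does not specify the weighting scheme, so the softmax, the function names `softmax` and `fuse_scores`, and the hand-set logits are all illustrative assumptions. In IAENet the importance would be learned and guided by the dedicated loss functions rather than fixed by hand.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(scores, importance_logits):
    """Fuse per-source anomaly scores with importance weights.

    scores: one anomaly score per source (e.g. 2D expert, 3D expert).
    importance_logits: one unnormalized importance value per source
        (hypothetical stand-in for what a learned module would predict).
    Returns the fused score and the normalized weights.
    """
    if len(scores) != len(importance_logits):
        raise ValueError("need exactly one importance logit per source")
    weights = softmax(importance_logits)
    fused = sum(w * s for w, s in zip(weights, scores))
    return fused, weights


# Hypothetical example: the 2D expert is confident the sample is anomalous,
# the 3D expert is not, and the 2D expert is judged more important.
fused, weights = fuse_scores(scores=[0.9, 0.2], importance_logits=[2.0, 0.0])
print(f"weights={weights}, fused={fused:.3f}")
```

With equal logits this reduces to a plain average of the scores, which is the kind of naive fusion the abstract warns can be dragged down by a weak modality; unequal weights let the stronger source dominate for a given sample.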