🤖 AI Summary
To address the scarcity of real anomalous samples in industrial 3D anomaly detection, this paper proposes a self-supervised, 3D-RGB dual-modal collaborative anomaly synthesis method. It injects anomalies into 3D point clouds or voxels while enforcing cross-modal feature alignment between RGB and 3D representations, generating semantically consistent dual-modal anomalous samples. Furthermore, the authors design a joint reconstruction-discrimination learning framework incorporating a dual-modal adversarial discriminator and an augmentation dropout strategy, enabling fused discrimination of original and reconstructed embeddings alongside pixel-level anomaly localization. To the authors' knowledge, this is the first work to introduce a dual-modal collaborative synthesis mechanism for 3D anomaly detection. The method achieves state-of-the-art detection accuracy on MVTec 3D-AD and Eyecandies, while attaining competitive segmentation performance.
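The synthesis step is easiest to picture on pixel-aligned data such as MVTec 3D-AD's organized point maps, where each RGB pixel has a corresponding 3D point. Below is a minimal NumPy sketch of that idea: a random region is dented or raised along estimated surface normals in 3D, and the same pixels are perturbed in RGB so the two modalities stay semantically consistent. The blob shape, displacement range, and color blending are illustrative assumptions, not the paper's exact recipe, and all function names are hypothetical.

```python
# Hedged sketch of dual-modal anomaly synthesis on an organized point
# map (H x W x 3, pixel-aligned with the RGB image, as in MVTec 3D-AD).
import numpy as np


def random_blob_mask(h, w, rng, n_steps=400):
    """Random-walk blob: a rough stand-in for an irregular defect region."""
    mask = np.zeros((h, w), dtype=bool)
    y, x = rng.integers(h // 4, 3 * h // 4), rng.integers(w // 4, 3 * w // 4)
    for _ in range(n_steps):
        mask[max(y - 2, 0):y + 3, max(x - 2, 0):x + 3] = True
        y = int(np.clip(y + rng.integers(-2, 3), 0, h - 1))
        x = int(np.clip(x + rng.integers(-2, 3), 0, w - 1))
    return mask


def synthesize_dual_modal_anomaly(points, rgb, rng):
    """Inject a geometrically and photometrically consistent anomaly.

    points : (H, W, 3) organized point map aligned with `rgb`
    rgb    : (H, W, 3) float image in [0, 1]
    Returns perturbed (points, rgb) and the ground-truth anomaly mask.
    """
    h, w, _ = points.shape
    mask = random_blob_mask(h, w, rng)

    # 3D branch: displace masked points along an approximate surface
    # normal, crudely estimated from local point-map gradients.
    du = np.gradient(points, axis=1)
    dv = np.gradient(points, axis=0)
    normals = np.cross(du, dv)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True) + 1e-8
    depth = rng.uniform(0.002, 0.01)   # dent/bump magnitude (assumed scale)
    sign = rng.choice([-1.0, 1.0])     # dent vs. protrusion
    points_aug = points.copy()
    points_aug[mask] += sign * depth * normals[mask]

    # RGB branch: perturb the *same* pixels so the two modalities
    # remain consistent (here a simple random color blend).
    rgb_aug = rgb.copy()
    color = rng.uniform(0.0, 1.0, size=(3,))
    alpha = rng.uniform(0.3, 0.8)
    rgb_aug[mask] = (1 - alpha) * rgb_aug[mask] + alpha * color

    return points_aug, rgb_aug, mask


rng = np.random.default_rng(0)
pts = rng.normal(size=(256, 256, 3)).astype(np.float32)
img = rng.uniform(size=(256, 256, 3)).astype(np.float32)
pts_a, img_a, gt_mask = synthesize_dual_modal_anomaly(pts, img, rng)
```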
📝 Abstract
Synthesizing anomalous samples has proven to be an effective strategy for self-supervised 2D industrial anomaly detection. However, this approach has rarely been explored in multi-modal anomaly detection, particularly in settings that combine 3D data and RGB images. In this paper, we propose a novel dual-modality augmentation method for 3D anomaly synthesis that is simple yet capable of mimicking the characteristics of 3D defects. Building on our anomaly synthesis method, we introduce a reconstruction-based discriminative anomaly detection network, in which a dual-modal discriminator fuses the original and reconstructed embeddings of the two modalities for anomaly detection. Additionally, we design an augmentation dropout mechanism to enhance the generalizability of the discriminator. Extensive experiments show that our method outperforms state-of-the-art methods in detection precision and achieves competitive segmentation performance on both the MVTec 3D-AD and Eyecandies datasets.
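To make the detection side concrete, the following PyTorch sketch shows one way such a network could be wired up: per-modality encoders produce feature maps, reconstruction networks regenerate them, and a discriminator fuses the four embeddings into pixel-level anomaly logits. "Augmentation dropout" is modeled here as skipping the anomaly synthesis with some probability so the discriminator also sees clean samples. The module shapes, dropout probability, and loss are assumptions for illustration, not the paper's reported configuration.

```python
# Hedged sketch of the reconstruction-plus-discrimination framework.
import torch
import torch.nn as nn


def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))


class DualModalDiscriminator(nn.Module):
    """Fuses (rgb, rgb_rec, xyz, xyz_rec) feature maps -> anomaly logits."""

    def __init__(self, c_feat=64):
        super().__init__()
        self.head = nn.Sequential(conv_block(4 * c_feat, 2 * c_feat),
                                  conv_block(2 * c_feat, c_feat),
                                  nn.Conv2d(c_feat, 1, 1))

    def forward(self, f_rgb, f_rgb_rec, f_xyz, f_xyz_rec):
        fused = torch.cat([f_rgb, f_rgb_rec, f_xyz, f_xyz_rec], dim=1)
        return self.head(fused)  # (B, 1, H, W) pixel-level logits


def training_step(enc_rgb, enc_xyz, rec_rgb, rec_xyz, disc,
                  rgb, xyz, synthesize, p_drop=0.3):
    """One step with 'augmentation dropout': with probability p_drop the
    anomaly synthesis is skipped and the target mask is all-normal, so
    the discriminator also learns from clean reconstructions."""
    if torch.rand(()) < p_drop:
        mask = torch.zeros(rgb.shape[0], 1, rgb.shape[2], rgb.shape[3],
                           device=rgb.device)
    else:
        rgb, xyz, mask = synthesize(rgb, xyz)  # dual-modal synthesis

    f_rgb, f_xyz = enc_rgb(rgb), enc_xyz(xyz)
    f_rgb_rec, f_xyz_rec = rec_rgb(f_rgb), rec_xyz(f_xyz)
    logits = disc(f_rgb, f_rgb_rec, f_xyz, f_xyz_rec)
    mask = nn.functional.interpolate(mask, size=logits.shape[-2:])
    return nn.functional.binary_cross_entropy_with_logits(logits, mask)
```

One plausible reading of the dropout mechanism is that showing the discriminator a fraction of unaugmented batches keeps it from keying on synthesis artifacts alone, which would explain the claimed gain in generalizability.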