Multimodal Industrial Anomaly Detection by Crossmodal Reverse Distillation

📅 2024-12-12
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing knowledge distillation (KD)-based methods for multimodal industrial anomaly detection have two key limitations: (1) the teacher representation is built from fused multimodal features, so an anomaly confined to a single modality can be obscured and missed; and (2) intra-modal fine-grained structure and inter-modal complementary information are not fully exploited. To address these issues, the paper proposes Crossmodal Reverse Distillation (CRD), a framework that combines independent per-modality branches with a Crossmodal Filter and Amplifier. Each modality gets its own branch for fine-grained anomaly localization, while crossmodal mapping, together with the Filter and Amplifier, strengthens the interaction between modalities during distillation. On MVTec 3D-AD, CRD is reported to achieve state-of-the-art performance in unsupervised multimodal anomaly detection and localization.
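To make limitation (1) concrete, here is a toy sketch of how scoring on a fused representation can dilute an anomaly that only one modality sees. It is not the paper's method: the feature vectors, the use of concatenation as the fusion step, and the cosine-distance score are all illustrative assumptions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / max(norm_a * norm_b, 1e-12)


# Hypothetical teacher features at one location of a defective sample:
# the RGB view looks normal, the depth view is anomalous.
rgb_teacher = [1.0, 0.0, 0.0, 0.0]
depth_teacher = [0.0, 1.0]

# Hypothetical student outputs: the student only knows normal patterns,
# so it reproduces the normal feature in both modalities.
rgb_student = [1.0, 0.0, 0.0, 0.0]
depth_student = [1.0, 0.0]

# Assumed fusion: plain concatenation of the two modality features.
fused_score = cosine_distance(rgb_teacher + depth_teacher,
                              rgb_student + depth_student)

# Per-branch scoring: each modality is compared on its own.
rgb_score = cosine_distance(rgb_teacher, rgb_student)
depth_score = cosine_distance(depth_teacher, depth_student)
branch_score = max(rgb_score, depth_score)

print(f"fused: {fused_score:.2f}, rgb: {rgb_score:.2f}, depth: {depth_score:.2f}")
```

In this toy setting the depth branch alone gives the maximum possible distance, while the fused score is pulled down by the agreeing RGB dimensions. Real fusion modules are learned and behave differently, so this only shows the direction of the effect.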

📝 Abstract
Knowledge distillation (KD) has been widely studied in unsupervised industrial image anomaly detection (AD), but its application to unsupervised multimodal AD remains underexplored. Existing KD-based methods for multimodal AD obtain teacher representations from fused multimodal features, which poses two problems. First, anomalies in one modality may not be effectively captured in the fused teacher features, leading to detection failures. Second, these methods do not fully leverage the rich intra- and inter-modality information. In this paper, we propose Crossmodal Reverse Distillation (CRD), a multi-branch design for multimodal industrial AD. By assigning an independent branch to each modality, our method enables finer detection of anomalies within each modality. Furthermore, we enhance the interaction between modalities during distillation by designing a Crossmodal Filter and Amplifier. Through crossmodal mapping, the student network learns normal features better while anomalies in all modalities are still effectively detected. Experiments on the MVTec 3D-AD dataset demonstrate that our method achieves state-of-the-art performance in multimodal anomaly detection and localization.
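As a rough illustration of the multi-branch idea, the sketch below builds one pixel-level anomaly map per modality from the teacher-student feature discrepancy and then combines the maps. Everything specific here is an assumption made for illustration: the cosine-distance score (common in distillation-based AD), the element-wise sum used to combine modalities, the function names, and the toy 1x2 feature maps. The Crossmodal Filter and Amplifier are not modeled.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / max(norm_a * norm_b, 1e-12)


def anomaly_map(teacher, student):
    """Per-pixel discrepancy between two H x W x C feature maps (nested lists)."""
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher, student)
    ]


def combine(maps):
    """Element-wise sum of per-modality maps (an assumed combination rule)."""
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [sum(m[i][j] for m in maps) for j in range(width)]
        for i in range(height)
    ]


# Hypothetical 1 x 2 feature maps with 2 channels per pixel.
# Only the depth teacher feature at the second pixel deviates from normal.
rgb_teacher = [[[1.0, 0.0], [1.0, 0.0]]]
rgb_student = [[[1.0, 0.0], [1.0, 0.0]]]
depth_teacher = [[[1.0, 0.0], [0.0, 1.0]]]
depth_student = [[[1.0, 0.0], [1.0, 0.0]]]

combined = combine([
    anomaly_map(rgb_teacher, rgb_student),      # RGB branch
    anomaly_map(depth_teacher, depth_student),  # depth branch
])

# Image-level score: the highest pixel score in the combined map.
image_score = max(max(row) for row in combined)
print(combined, image_score)
```

Because each branch is scored separately before the maps are combined, the depth-only defect at the second pixel stays fully visible in the final map.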
Problem

Research questions and friction points this paper is trying to address.

KD is well studied for unsupervised image AD but underexplored for unsupervised multimodal AD.
Teacher representations built from fused multimodal features can miss anomalies confined to one modality.
Existing KD-based multimodal methods do not fully leverage intra- and inter-modality information.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Crossmodal Reverse Distillation for anomaly detection
Multi-branch design per modality for finer detection
Crossmodal Filter and Amplifier enhance modality interaction