🤖 AI Summary
Knowledge distillation schemes for image anomaly detection tend to over-generalize because the input and supervision signals are highly similar. To address this, the paper proposes Masked Reverse Knowledge Distillation (MRKD), which reformulates the conventional reconstruction task as masked inpainting, decoupling input from supervision to curb the generalization bias. MRKD pairs Image-Level Masking (ILM), which differentiates the input from the supervisory signal to capture global context, with Feature-Level Masking (FLM), which injects synthetic feature-level anomalies so that the learned representations retain sufficient local detail. Built on the reverse distillation paradigm, MRKD thus combines self-supervised restoration, feature perturbation, and masked inpainting in one framework. On the MVTec AD dataset it achieves strong performance: 98.9% image-level AU-ROC, 98.4% pixel-level AU-ROC, and 95.3% AU-PRO. Ablation studies confirm its effectiveness in suppressing over-generalization.
📝 Abstract
Knowledge distillation is an effective scheme for image anomaly detection and localization. However, a major drawback of this scheme is its tendency to over-generalize, primarily because of the similarity between its input and supervisory signals. To address this issue, this paper introduces a novel technique called masked reverse knowledge distillation (MRKD). By employing image-level masking (ILM) and feature-level masking (FLM), MRKD transforms the task of image reconstruction into image restoration. Specifically, ILM helps capture global information by differentiating the input signals from the supervisory signals, while FLM incorporates synthetic feature-level anomalies to ensure that the learned representations contain sufficient local information. With these two strategies, MRKD gains a stronger capacity to capture image context and is less prone to over-generalization. Experiments on the widely used MVTec anomaly detection dataset demonstrate that MRKD achieves impressive performance: 98.9% image-level AU-ROC, 98.4% pixel-level AU-ROC, and 95.3% AU-PRO. In addition, extensive ablation experiments validate the superiority of MRKD in mitigating the over-generalization problem.
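The core idea of image-level masking, feeding the network a partially masked image while supervising against the unmasked original so that reconstruction becomes inpainting, can be sketched as follows. This is a minimal illustration only: the patch size, mask ratio, and zero-fill value are assumptions for demonstration, not the paper's actual ILM settings.

```python
import numpy as np

def image_level_mask(image, patch=4, ratio=0.3, rng=None):
    """Zero out a random fraction of non-overlapping square patches.

    Illustrative sketch of image-level masking (ILM): the returned masked
    image would serve as the network input, while the original image stays
    the supervisory signal, so the model must inpaint the hidden regions
    rather than copy its input. All hyperparameters here are assumptions.
    """
    rng = np.random.default_rng(rng)
    h, w = image.shape[:2]
    masked = image.copy()
    # Enumerate the patch grid and pick a random subset of cells to mask.
    cells = [(i, j) for i in range(0, h, patch) for j in range(0, w, patch)]
    n_mask = int(len(cells) * ratio)
    for idx in rng.choice(len(cells), size=n_mask, replace=False):
        i, j = cells[idx]
        masked[i:i + patch, j:j + patch] = 0.0
    return masked

# Example: mask roughly 30% of a 32x32 image in 4x4 patches.
img = np.ones((32, 32), dtype=np.float32)
masked = image_level_mask(img, patch=4, ratio=0.3, rng=0)
```

During training, the masked tensor is the input and the original image (or its teacher features, in the reverse-distillation setting) is the target, which is what breaks the input/supervision similarity the paper identifies as the cause of over-generalization.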