🤖 AI Summary
This work addresses industrial surface defect detection, which is often hindered by scarce defect samples, long-tailed distributions, and the difficulty of localizing subtle anomalies against complex backgrounds. The authors propose an unsupervised method that integrates a denoising diffusion probabilistic model (DDPM) with an asymmetric teacher–student network. The DDPM is trained exclusively on normal samples and, combined with Perlin noise masks, generates high-fidelity synthetic defect images along with their pixel-level annotations. A dual-stream teacher–student architecture then emphasizes anomalous regions, and the model is jointly optimized with a cosine similarity loss and pixel-wise segmentation supervision. Evaluated on the MVTec AD dataset, the method achieves 98.4% image-level and 98.3% pixel-level AUROC, which the authors report as significantly outperforming existing unsupervised and mainstream deep learning approaches.
📝 Abstract
Industrial surface defect detection often suffers from limited defect samples, severe long-tailed distributions, and difficulty in accurately localizing subtle defects against complex backgrounds. To address these challenges, this paper proposes an unsupervised defect detection method that integrates a Denoising Diffusion Probabilistic Model (DDPM) with an asymmetric teacher-student architecture. First, at the data level, the DDPM is trained solely on normal samples. By introducing constant-variance Gaussian perturbations and Perlin noise-based masks, the method generates high-fidelity, physically consistent defect samples together with pixel-level annotations, alleviating the data scarcity problem. Second, at the model level, an asymmetric dual-stream network is constructed. The teacher network provides stable representations of normal features, while the student network reconstructs normal patterns and amplifies discrepancies between normal and anomalous regions. Finally, a joint optimization strategy combining a cosine similarity loss and pixel-wise segmentation supervision is adopted to achieve precise localization of subtle defects. Experimental results on the MVTec AD dataset show that the proposed method achieves 98.4% image-level AUROC and 98.3% pixel-level AUROC, significantly outperforming existing unsupervised and mainstream deep learning methods. The proposed approach does not require large amounts of real defect samples and enables accurate and robust industrial defect detection and localization.
Keywords: industrial defect detection; diffusion models; data generation; teacher-student architecture; pixel-level localization
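The abstract states that teacher and student features are compared with a cosine similarity loss, but it does not give the formulation. The following is a minimal sketch of one common way to do this, assuming a per-pixel cosine distance (1 minus cosine similarity) between feature vectors. The function names, the feature layout (height x width x channels as nested lists), and the use of the maximum map value as an image-level score are illustrative assumptions, not the authors' implementation; a real pipeline would use a tensor library and learned features.

```python
import math
import random


def cosine_distance(u, v, eps=1e-8):
    """Return 1 - cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + eps)


def anomaly_map(teacher, student):
    """Per-pixel cosine distance between two H x W x C feature maps."""
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher, student)
    ]


def cosine_loss(teacher, student):
    """Mean cosine distance over all pixels (assumed training objective)."""
    values = [d for row in anomaly_map(teacher, student) for d in row]
    return sum(values) / len(values)


# Toy data: random "teacher" features, and a "student" that matches them
# everywhere except a 2 x 2 patch standing in for a defect region.
random.seed(0)
H, W, C = 8, 8, 16
teacher = [[[random.gauss(0, 1) for _ in range(C)] for _ in range(W)]
           for _ in range(H)]
student = [[list(vec) for vec in row] for row in teacher]
for y in range(2, 4):
    for x in range(2, 4):
        student[y][x] = [random.gauss(0, 1) for _ in range(C)]

amap = anomaly_map(teacher, student)
# Assumed image-level score: the largest per-pixel distance.
image_score = max(max(row) for row in amap)
```

In this toy setup the map is close to zero wherever the two feature maps agree and larger inside the perturbed patch, which is the behaviour the abstract describes as amplifying discrepancies between normal and anomalous regions.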