🤖 AI Summary
Pretraining-finetuning paradigms for metal surface defect detection suffer from two key bottlenecks: (i) the domain gap arising from natural-image pretraining, and (ii) the limited discriminability of fine-grained defects against complex background noise in industrial self-supervised learning. Method: We propose an anomaly-guided self-supervised pretraining framework. Its core innovation is the first use of anomaly maps as weak supervision signals, realized in a two-stage pipeline: Stage I pretrains the backbone by distilling knowledge from anomaly maps, which are generated with a knowledge-enhanced synthesis method; Stage II pretrains the detector with pseudo-defect bounding boxes derived from those maps. The method enables end-to-end self-supervised training on large-scale industrial imagery. Contribution/Results: It significantly improves sensitivity to small defects and localization accuracy. Experiments show gains of up to +10% mAP@0.5 and +11.4% mAP@0.5:0.95 across multiple settings, substantially outperforming ImageNet-initialized baselines.
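The Stage-I idea, distilling a precomputed anomaly map into the backbone, can be sketched as pixel-wise regression from backbone features onto the map. The shapes, the 1x1 projection weights `w`, and the MSE objective below are illustrative assumptions for intuition, not the paper's actual losses:

```python
import numpy as np

def distill_loss(features, anomaly_map, w):
    """Toy anomaly-map distillation objective (illustrative, not the paper's loss).

    features:    (C, H, W) backbone feature map
    anomaly_map: (H, W) precomputed anomaly map in [0, 1] (the weak label)
    w:           (C,) learnable 1x1-conv weights projecting features to one channel
    """
    # Project features to a single-channel predicted anomaly map.
    pred = np.tensordot(w, features, axes=([0], [0]))  # (H, W)
    pred = 1.0 / (1.0 + np.exp(-pred))                 # squash to [0, 1]
    # Pixel-wise MSE against the anomaly map is the distillation signal:
    # minimizing it pushes the backbone toward defect-salient features.
    return float(np.mean((pred - anomaly_map) ** 2))

# Tiny example: random features, one synthetic "defect" blob as the weak label.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16, 16))
amap = np.zeros((16, 16))
amap[4:8, 4:8] = 1.0
loss = distill_loss(feats, amap, rng.standard_normal(8))
```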
📝 Abstract
The pretraining-finetuning paradigm is a crucial strategy in metallic surface defect detection for mitigating the challenges posed by data scarcity. However, its implementation presents a critical dilemma. Pretraining on natural image datasets, such as ImageNet, faces a significant domain gap. Meanwhile, naive self-supervised pretraining on in-domain industrial data is often ineffective due to the inability of existing learning objectives to distinguish subtle defect patterns from complex background noise and textures. To resolve this, we introduce Anomaly-Guided Self-Supervised Pretraining (AGSSP), a novel paradigm that explicitly guides representation learning through anomaly priors. AGSSP employs a two-stage framework: (1) it first pretrains the model's backbone by distilling knowledge from anomaly maps, encouraging the network to capture defect-salient features; (2) it then pretrains the detector using pseudo-defect boxes derived from these maps, aligning it with localization tasks. To enable this, we develop a knowledge-enhanced method to generate high-quality anomaly maps and collect a large-scale industrial dataset of 120,000 images. Additionally, we present two small-scale, pixel-level labeled metallic surface defect datasets for validation. Extensive experiments demonstrate that AGSSP consistently enhances performance across various settings, achieving up to a 10% improvement in mAP@0.5 and 11.4% in mAP@0.5:0.95 compared to ImageNet-based models. All code, pretrained models, and datasets are publicly available at https://clovermini.github.io/AGSSP-Dev/.
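Stage (2) turns anomaly maps into pseudo-defect boxes. A minimal sketch of that step is to threshold the map, group anomalous pixels into connected components, and take each component's bounding box; the threshold and minimum-area filter below are assumed values for illustration, not the paper's settings:

```python
import numpy as np
from collections import deque

def anomaly_to_boxes(anomaly_map, thresh=0.5, min_area=4):
    """Derive pseudo-defect boxes from an anomaly map (illustrative sketch).

    Thresholds the map, finds 4-connected components via BFS, and returns one
    (x_min, y_min, x_max, y_max) box per component with at least min_area pixels.
    """
    mask = anomaly_map >= thresh
    visited = np.zeros_like(mask, dtype=bool)
    H, W = mask.shape
    boxes = []
    for sy in range(H):
        for sx in range(W):
            if not mask[sy, sx] or visited[sy, sx]:
                continue
            # BFS over one connected component, tracking its pixel extent.
            q = deque([(sy, sx)])
            visited[sy, sx] = True
            ys, xs = [sy], [sx]
            while q:
                y, x = q.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < H and 0 <= nx < W and mask[ny, nx] and not visited[ny, nx]:
                        visited[ny, nx] = True
                        q.append((ny, nx))
                        ys.append(ny)
                        xs.append(nx)
            if len(ys) >= min_area:  # drop tiny components (likely noise)
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

# Two synthetic anomaly blobs yield two pseudo-boxes.
amap = np.zeros((12, 12))
amap[2:5, 3:7] = 0.9
amap[8:10, 8:10] = 0.8
print(anomaly_to_boxes(amap))  # → [(3, 2, 6, 4), (8, 8, 9, 9)]
```

These boxes then serve as weak localization targets for detector pretraining, standing in for human-annotated defect boxes.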