🤖 AI Summary
To address the performance bottleneck of industrial visual inspection models caused by scarce defective samples, this paper proposes a defect image generation method based on fine-tuning the Stable Diffusion inpainting model. Leveraging only a small number of real defective images, it synthesizes high-fidelity, precisely localized defect samples. Key contributions include: (1) a multi-objective customized loss function integrating defect structural priors, object-level semantic constraints, and spatial attention mechanisms; and (2) an automated low-fidelity sample filtering mechanism to enhance generation quality. Evaluated on the MVTec AD dataset, the generated defect images achieve state-of-the-art fidelity and significantly improve downstream anomaly detection performance—yielding substantial gains in both AUC and PRO metrics.
📝 Abstract
Developing effective visual inspection models remains challenging due to the scarcity of defect data. While image generation models have been used to synthesize defect images, producing highly realistic defects remains difficult. We propose DefectFill, a novel method for realistic defect generation that requires only a few reference defect images. It leverages a fine-tuned inpainting diffusion model, optimized with our custom loss functions incorporating defect, object, and attention terms. This enables precise capture of detailed, localized defect features and their seamless integration into defect-free objects. Additionally, our Low-Fidelity Selection method further enhances the quality of the generated defect samples. Experiments show that DefectFill generates high-quality defect images, enabling visual inspection models to achieve state-of-the-art performance on the MVTec AD dataset.
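The abstract describes two mechanisms: a training objective combining defect, object, and attention loss terms, and a Low-Fidelity Selection step that filters out poor generations. A minimal sketch of how these two pieces could fit together is shown below; all function names, weights, and the scoring function are illustrative assumptions based solely on the abstract, not the paper's actual implementation.

```python
# Hypothetical sketch of the two mechanisms from the abstract.
# Weights and the fidelity score are assumptions, not the paper's values.

def combined_loss(defect_loss, object_loss, attention_loss,
                  w_defect=1.0, w_object=0.5, w_attention=0.1):
    """Weighted sum of the three loss terms named in the abstract
    (defect, object, and attention). Weight values are illustrative."""
    return (w_defect * defect_loss
            + w_object * object_loss
            + w_attention * attention_loss)


def low_fidelity_selection(samples, score_fn, keep_ratio=0.8):
    """Rank generated samples by an (assumed) fidelity score and keep
    only the top fraction, discarding low-fidelity generations."""
    ranked = sorted(samples, key=score_fn, reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:keep]
```

In practice the three loss terms would be computed inside the diffusion model's fine-tuning loop, and `score_fn` would be a learned or perceptual fidelity measure; the sketch only illustrates the weighted-sum and filter-by-rank structure implied by the abstract.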