🤖 AI Summary
This study addresses the difficulty of training reliable supervised defect detectors for industrial visual inspection during new product introduction (NPI), when defective samples are scarce. The authors propose an end-to-end, diffusion-based framework for few-shot defect synthesis that decouples defect morphology from background appearance. It combines three components: masked textual inversion to learn the defect representation, noise-blended conditioned generation to synthesize defects on a given surface, and gradient-aware post-processing to integrate them seamlessly. The framework supports both few-shot data augmentation and zero-shot cross-domain adaptation, the latter without any real defective samples from the target domain. With RF-DETR as the downstream detector on a private industrial dataset, mAP improves from 78.8% to 83.3% (+4.5 points) in the few-shot setting and from 65.0% to 85.1% (+20.1 points) in the zero-shot cross-domain setting.
📝 Abstract
Industrial visual inspection systems often suffer from a severe scarcity of labeled defect data, particularly during the early stages of New Product Introduction (NPI). This limitation hinders the deployment of robust supervised detectors precisely when automated quality control is most needed. We present an end-to-end generative framework for high-fidelity, few-shot defect synthesis that enables both in-domain augmentation and cross-domain transfer. Our approach disentangles defect morphology from background appearance by combining masked textual inversion for defect representation learning, noise-blended conditioned generation for surface-aware synthesis, and gradient-aware post-processing for seamless visual integration. We evaluate the framework in two practically relevant settings: few-shot data augmentation, where synthetic samples enrich a small set of real defects, and zero-shot adaptation, where defects learned from a source domain are transferred to a novel target surface without any real target-domain defect examples. Using RF-DETR as the downstream detector, we show that the proposed pipeline substantially narrows the domain gap on a private industrial dataset. In the few-shot setting, synthetic augmentation improves mAP from 78.8% to 83.3%. In the zero-shot setting, synthetic domain adaptation improves mAP from 65.0% to 85.1%. These results demonstrate that high-fidelity defect synthesis can meaningfully accelerate NPI by enabling effective inspection models before sufficient real defect data has been collected.
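The abstract does not spell out how noise-blended generation is implemented. As a rough illustration only, the sketch below shows one common way such blending is done in mask-guided diffusion editing: at a given denoising step, the generated content is kept inside the defect mask, while the region outside the mask is replaced by the clean background noised to the same level. The function names (`forward_diffuse`, `blend_step`), the 2D-list representation, and the `alpha_bar_t` schedule value are all assumptions made for this example and are not taken from the paper; a real pipeline would operate on latent tensors.

```python
import math
import random

def forward_diffuse(x0, alpha_bar_t, rng):
    """Noise a clean 2D array to the level alpha_bar_t (standard DDPM forward process)."""
    signal = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [[signal * v + sigma * rng.gauss(0.0, 1.0) for v in row] for row in x0]

def blend_step(generated, background, mask, alpha_bar_t, rng):
    """Hypothetical blending step (an assumption, not the authors' method):
    keep `generated` where mask == 1, and use the noised background where mask == 0."""
    bg_noised = forward_diffuse(background, alpha_bar_t, rng)
    return [
        [m * g + (1.0 - m) * b for g, b, m in zip(g_row, b_row, m_row)]
        for g_row, b_row, m_row in zip(generated, bg_noised, mask)
    ]

# Toy usage: a 4x4 "surface" with a 2x2 defect region in the centre.
rng = random.Random(0)
background = [[0.0] * 4 for _ in range(4)]
generated = [[1.0] * 4 for _ in range(4)]
mask = [[1.0 if 1 <= r <= 2 and 1 <= c <= 2 else 0.0 for c in range(4)] for r in range(4)]

# alpha_bar_t = 1.0 corresponds to the final, noise-free step.
final = blend_step(generated, background, mask, 1.0, rng)
```

At the noise-free step the output equals the original background outside the mask and the generated content inside it, which is what lets the synthesized defect sit on an otherwise untouched target surface. Any seam along the mask boundary would then be handled by a separate stage, such as the gradient-aware post-processing the abstract mentions.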