🤖 AI Summary
To address the challenge of synthetic data failing to approximate real-world distributions in manufacturing object detection, this paper proposes an end-to-end synthetic data generation pipeline covering object attributes, background, illumination, camera parameters, and post-processing, and introduces the SIP15-OD industrial component detection dataset. We systematically quantify, for the first time, the impact of material properties, rendering techniques, post-processing, and distractors on sim-to-real transfer. A fine-grained, multi-factor joint domain randomization strategy tailored to manufacturing scenarios is introduced. Leveraging physics-aware rendering in Blender/Unity, YOLOv8 models trained exclusively on synthetic data achieve 96.4% mAP@50 on a public robotics benchmark and 94.1%, 99.5%, and 95.3% mAP@50 across three SIP15-OD operational conditions, performance comparable to models trained exclusively on real data.
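The mAP@50 figures above count a predicted box as correct when its intersection over union (IoU) with a ground-truth box reaches 0.5. The sketch below illustrates only that matching criterion, not the full mean average precision computation. The function names and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration, and whether the threshold comparison is `>=` or `>` varies between evaluation toolkits.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: does a prediction match a ground-truth box at IoU 0.5?"""
    return iou(pred_box, gt_box) >= threshold
```

A full mAP@50 evaluation additionally ranks predictions by confidence, builds a precision-recall curve per class, and averages the resulting per-class average precision values.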
📝 Abstract
This paper addresses key aspects of domain randomization in generating synthetic data for manufacturing object detection applications. To this end, we present a comprehensive data generation pipeline that accounts for several factors: object characteristics, background, illumination, camera settings, and post-processing. We also introduce the Synthetic Industrial Parts Object Detection dataset (SIP15-OD), consisting of 15 objects from three industrial use cases under varying environments, as a test bed for the study, and additionally evaluate on a publicly available industrial dataset for robotic applications. Our experiments provide extensive results and insights into both the feasibility and the challenges of sim-to-real object detection. In particular, we identify material properties, rendering methods, post-processing, and distractors as important factors. Leveraging these factors, our method achieves top performance on the public dataset with YOLOv8 models trained exclusively on synthetic data, with mAP@50 scores of 96.4% on the robotics dataset and 94.1%, 99.5%, and 95.3% on the three SIP15-OD use cases, respectively. These results demonstrate the effectiveness of the proposed domain randomization and suggest that the generated data can cover a distribution close to that of real data for these applications.
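As a rough illustration of multi-factor domain randomization, the sketch below samples one scene configuration per rendered image, drawing each factor named in the abstract independently. It is a minimal sketch and not the paper's pipeline: every name, category, and numeric range here is a hypothetical placeholder, and a real setup would pass the sampled values to a renderer such as Blender or Unity.

```python
import random
from dataclasses import dataclass

# All categories and ranges below are hypothetical placeholders,
# not values taken from the paper.
MATERIALS = ("matte_plastic", "brushed_metal", "glossy_metal")
BACKGROUNDS = ("workbench", "conveyor", "random_texture")
POST_EFFECTS = ("none", "gaussian_blur", "sensor_noise")


@dataclass(frozen=True)
class SceneConfig:
    material: str                # object characteristics
    background: str
    light_intensity: float       # illumination, arbitrary units
    camera_distance_m: float     # camera settings
    camera_elevation_deg: float
    num_distractors: int
    post_effect: str             # post-processing


def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw one scene configuration, sampling every factor independently."""
    return SceneConfig(
        material=rng.choice(MATERIALS),
        background=rng.choice(BACKGROUNDS),
        light_intensity=rng.uniform(0.2, 2.0),
        camera_distance_m=rng.uniform(0.3, 1.5),
        camera_elevation_deg=rng.uniform(10.0, 80.0),
        num_distractors=rng.randint(0, 5),
        post_effect=rng.choice(POST_EFFECTS),
    )


def sample_dataset(n: int, seed: int = 0) -> list[SceneConfig]:
    """Sample n configurations reproducibly from a seeded generator."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]
```

Seeding the generator keeps a synthetic dataset reproducible, which makes it easier to compare runs in which a single factor (for example, distractors) is switched on or off.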