🤖 AI Summary
Real annotated data for robotic vision is scarce: manual annotation is time-consuming and error-prone, and scene diversity is limited. To address this, the paper proposes a synthetic data generation framework built on Unreal Engine and 3D Gaussian Splatting (3DGS). It is the first work to integrate 3DGS into highly dynamic robotic scenarios (e.g., robot soccer), enabling fully automatic synthesis of photorealistic images with pixel-accurate annotations. YOLO-style detectors trained on the synthetic data alone perform on par with models trained on real annotated data. A hybrid synthetic–real training strategy further improves generalization, yielding a +12.7% mAP improvement on real-world benchmarks. Overall, the method improves annotation efficiency by over 90%, offering an efficient and reliable approach to data generation for resource-constrained robotic vision tasks.
📝 Abstract
Annotated datasets are critical for training neural networks for object detection, yet their manual creation is time- and labour-intensive, subject to human error, and often limited in diversity. This challenge is particularly pronounced in robotics, where diverse and dynamic scenarios further complicate the creation of representative datasets. To address this, we propose a novel method for automatically generating annotated synthetic data in Unreal Engine. Our approach leverages photorealistic 3D Gaussian splats for rapid synthetic data generation. We demonstrate that synthetic datasets can achieve performance comparable to that of real-world datasets while significantly reducing the time required to generate and annotate data. Additionally, combining real-world and synthetic data significantly increases object detection performance by pairing the quality of real-world images with the easier scalability of synthetic data. To our knowledge, this is the first application of synthetic data for training object detection algorithms in the highly dynamic and varied environment of robot soccer. Validation experiments reveal that a detector trained on synthetic images performs on par with one trained on manually annotated real-world images when tested on robot soccer match scenarios. Our method offers a scalable and comprehensive alternative to traditional dataset creation, eliminating the labour-intensive, error-prone manual annotation process. By generating datasets in a simulator where all elements are intrinsically known, we ensure accurate annotations while significantly reducing manual effort, which makes the method particularly valuable for robotics applications requiring diverse and scalable training data.
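To illustrate why a simulator can produce annotations without manual effort, the sketch below projects the known 3D corners of an object through a pinhole camera and writes the result as a YOLO-style label line (class, normalised box centre, normalised box size). This is not the paper's pipeline. The function names, the camera convention (x right, y down, z forward) and the intrinsics are assumptions chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def project_point(p: Point3D, fx: float, fy: float,
                  cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-frame point to pixel coordinates.

    Assumed convention: x right, y down, z forward (in front of the camera).
    """
    x, y, z = p
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def yolo_label(class_id: int, corners: Sequence[Point3D],
               fx: float, fy: float, cx: float, cy: float,
               width: int, height: int) -> str:
    """Build a YOLO-style label line from an object's known 3D corners.

    The 2D box is the axis-aligned extent of the projected corners,
    clipped to the image and normalised to [0, 1].
    """
    pixels: List[Tuple[float, float]] = [
        project_point(c, fx, fy, cx, cy) for c in corners
    ]
    # Clip projected coordinates to the image bounds.
    us = [min(max(u, 0.0), float(width)) for u, _ in pixels]
    vs = [min(max(v, 0.0), float(height)) for _, v in pixels]
    u_min, u_max = min(us), max(us)
    v_min, v_max = min(vs), max(vs)
    # Normalised centre and size, as used by YOLO-style label files.
    xc = (u_min + u_max) / 2 / width
    yc = (v_min + v_max) / 2 / height
    w = (u_max - u_min) / width
    h = (v_max - v_min) / height
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


# Hypothetical example: a 1 m square facing the camera at 2 m distance.
square = [(-0.5, -0.5, 2.0), (0.5, -0.5, 2.0),
          (0.5, 0.5, 2.0), (-0.5, 0.5, 2.0)]
label = yolo_label(0, square, fx=400.0, fy=400.0,
                   cx=320.0, cy=240.0, width=640, height=480)
print(label)  # 0 0.500000 0.500000 0.312500 0.416667
```

Because the object pose and camera parameters are known exactly in the simulator, the label follows from geometry alone. A real pipeline would also need to handle occlusion and objects partly behind the camera, which this sketch does not attempt.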