🤖 AI Summary
Weak domain transferability from simulation to real-world data remains a critical challenge in LiDAR point cloud-based 3D object detection. To address this, this paper establishes a high-fidelity synthetic data generation pipeline built upon CARLA and conducts the first systematic analysis of how virtual LiDAR sensor parameters, including scanning pattern, noise model, and vertical resolution, affect the domain gap. We introduce fine-grained sensor modeling and domain randomization strategies to better align synthetic and real-world sensing characteristics. Our method achieves 92.3% mAP on KITTI using only 10% of the real annotated data for fine-tuning, nearly matching the fully supervised PointPillars baseline, and slightly surpasses it when fine-tuned on the full real training set. Moreover, models trained exclusively on synthetic data significantly outperform existing synthetic-data-driven approaches, empirically validating that accurate modeling of sensor perception properties is essential for effective cross-domain generalization.
📝 Abstract
Simulation is an important factor in advancing autonomous driving systems. Yet progress on transferability between the virtual and the real world remains limited. We revisit this problem for 3D object detection on LiDAR point clouds and propose a dataset generation pipeline based on the CARLA simulator. Utilizing domain randomization strategies and careful sensor modeling, we are able to train an object detector on the synthetic data and demonstrate strong generalization to the KITTI dataset. Furthermore, we compare different virtual sensor variants to gain insight into which sensor attributes may be responsible for the prevalent domain gap. Finally, fine-tuning with a small portion of real data almost matches the baseline, and fine-tuning with the full training set slightly surpasses it.
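The sensor-parameter domain randomization described above can be sketched in a few lines of Python. The parameter names mirror CARLA's ray-cast LiDAR blueprint attributes (`channels`, `rotation_frequency`, `noise_stddev`, `upper_fov`, `lower_fov`), but the numeric ranges and the helper `sample_lidar_config` are illustrative assumptions for this sketch, not the paper's actual configuration.

```python
import random

# Assumed randomization ranges, chosen for illustration only;
# the paper's exact ranges are not given in the abstract.
LIDAR_PARAM_RANGES = {
    "channels": [32, 64, 128],           # vertical resolution (beam count)
    "rotation_frequency": (10.0, 20.0),  # Hz; affects the scanning pattern
    "noise_stddev": (0.0, 0.03),         # m; per-point range noise model
    "upper_fov": (2.0, 10.0),            # deg; top of the vertical FOV
    "lower_fov": (-30.0, -16.0),         # deg; bottom of the vertical FOV
}


def sample_lidar_config(rng: random.Random) -> dict:
    """Draw one randomized virtual-LiDAR configuration."""
    return {
        "channels": rng.choice(LIDAR_PARAM_RANGES["channels"]),
        "rotation_frequency": rng.uniform(*LIDAR_PARAM_RANGES["rotation_frequency"]),
        "noise_stddev": rng.uniform(*LIDAR_PARAM_RANGES["noise_stddev"]),
        "upper_fov": rng.uniform(*LIDAR_PARAM_RANGES["upper_fov"]),
        "lower_fov": rng.uniform(*LIDAR_PARAM_RANGES["lower_fov"]),
    }


if __name__ == "__main__":
    # In CARLA, each sampled config would be applied to the ray-cast LiDAR
    # blueprint, e.g. blueprint.set_attribute("channels", str(cfg["channels"])),
    # before spawning the sensor for a new simulated recording.
    rng = random.Random(0)
    print(sample_lidar_config(rng))
```

Resampling a fresh configuration per simulated sequence exposes the detector to many sensing characteristics, which is the mechanism by which domain randomization narrows the gap to a real sensor such as KITTI's Velodyne.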