🤖 AI Summary
To address the high cost of manual LiDAR annotation and the insufficient coverage of rare traffic scenarios in real-world data, this paper proposes a plug-and-play simulation-enhanced framework. Methodologically, it introduces three components: (1) point cloud jittering augmentation to enrich geometric diversity; (2) a domain-aware backbone network that jointly models simulation and real-world features; and (3) a memory-based, sector-level geometric alignment mechanism to mitigate the domain shift between synthetic and real LiDAR data. The framework is compatible with mainstream 3D detectors (e.g., CenterPoint) without requiring architectural modifications to downstream models. Evaluated on nuScenes, it matches fully supervised performance while using labels on only 2.5% of the real data. Moreover, it achieves more than 15 mAP on object classes absent from the real training labels, improving detection robustness for long-tail and rare traffic participants.
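The jittering augmentation in component (1) can be illustrated with a minimal sketch. The paper does not specify its exact scheme; the clipped per-point Gaussian perturbation below (function name, `sigma`, and `clip` values are all assumptions for illustration) is a common way to enrich geometric diversity in point clouds:

```python
import numpy as np

def jitter_point_cloud(points, sigma=0.02, clip=0.05, rng=None):
    """Apply clipped Gaussian jitter to the xyz coordinates of a LiDAR
    point cloud. `points` is an (N, 3+) array; extra columns such as
    intensity are left untouched. This is an illustrative sketch, not
    the paper's exact augmentation."""
    rng = np.random.default_rng(rng)
    noise = np.clip(sigma * rng.standard_normal((points.shape[0], 3)),
                    -clip, clip)
    jittered = points.copy()
    jittered[:, :3] += noise
    return jittered

# Toy cloud of 4 points with an intensity channel.
cloud = np.array([[1.0, 2.0, 0.5, 0.9],
                  [3.0, 1.0, 0.2, 0.4],
                  [0.5, 0.5, 1.0, 0.7],
                  [2.0, 2.0, 0.0, 0.1]])
out = jitter_point_cloud(cloud, rng=0)
```

Each point is displaced by at most `clip` meters per axis, so object-level geometry is perturbed without destroying the scene layout.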
📝 Abstract
Deep-learning-based autonomous driving (AD) perception paints a promising picture for safe and environmentally friendly transportation. However, the over-reliance on labeled real data in LiDAR perception limits the scale of on-road deployment. Real-world 3D data is notoriously time- and labor-consuming to annotate and lacks corner cases such as rare traffic participants. In contrast, simulators like CARLA make generating labeled LiDAR point clouds with corner cases straightforward. However, introducing synthetic point clouds to improve real-world perception is non-trivial, owing to two challenges: 1) the sample efficiency of simulation datasets and 2) simulation-to-real gaps. To overcome both, we propose a plug-and-play method called JiSAM, shorthand for Jittering augmentation, domain-aware backbone and memory-based Sectorized AlignMent. In extensive experiments on the widely used AD dataset nuScenes, we demonstrate that, with a SOTA 3D object detector, JiSAM can exploit the simulation data together with labels on only 2.5% of the available real data to achieve performance comparable to models trained on all real data. Additionally, JiSAM achieves more than 15 mAP on objects not labeled in the real training set. We will release models and code.
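The Sectorized AlignMent component operates at the sector level, which presupposes partitioning each LiDAR sweep into sectors around the ego vehicle. The abstract does not give the partitioning rule; a common choice, sketched below under that assumption (the function name and sector count are illustrative), is to bin points by azimuth angle:

```python
import numpy as np

def assign_sectors(points, num_sectors=8):
    """Bin points into equal azimuthal sectors around the ego vehicle
    (origin). `points` is an (N, 3+) array; returns an integer sector
    id in [0, num_sectors) per point. Illustrative sketch only."""
    # Azimuth in (-pi, pi], measured from the +x axis in the xy-plane.
    azimuth = np.arctan2(points[:, 1], points[:, 0])
    # Shift to [0, 2*pi), scale to sector width, wrap the boundary.
    idx = np.floor((azimuth + np.pi) / (2 * np.pi) * num_sectors)
    return idx.astype(int) % num_sectors

# Four points on the coordinate axes land in four distinct sectors.
pts = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [-1.0, 0.0, 0.0],
                [0.0, -1.0, 0.0]])
sectors = assign_sectors(pts, num_sectors=4)
```

Per-sector statistics of simulated and real point clouds could then be compared (e.g. against a memory bank of real-domain features) to drive the alignment loss, though the paper's exact mechanism may differ.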