🤖 AI Summary
This work addresses the domain gap between synthetic and real LiDAR data in semantic segmentation, which stems primarily from discrepancies in reflectance intensity and raydrop noise. To bridge this gap, the authors propose an unpaired Sim2Real translation framework based on diffusion models. Using a diffusion model pretrained on real LiDAR scans as a generative prior, the method applies a raydrop-aware masked guidance mechanism that reproduces realistic reflectance and raydrop characteristics while preserving the structure of the input synthetic point cloud. Experiments show that the approach consistently improves cross-domain semantic segmentation across multiple LiDAR representations, reducing reliance on costly real-world annotations while maintaining strong generalization and high generation fidelity.
📝 Abstract
LiDAR-based semantic segmentation is a key component for autonomous mobile robots, yet large-scale annotation of LiDAR point clouds is prohibitively expensive and time-consuming. Although simulators can provide labeled synthetic data, models trained on synthetic data often underperform on real-world data due to a data-level domain gap. To address this issue, we propose DRUM, a novel Sim2Real translation framework. We leverage a diffusion model pre-trained on unlabeled real-world data as a generative prior and translate synthetic data by reproducing two key measurement characteristics: reflectance intensity and raydrop noise. To improve sample fidelity, we introduce a raydrop-aware masked guidance mechanism that selectively enforces consistency with the input synthetic data while preserving realistic raydrop noise induced by the diffusion prior. Experimental results demonstrate that DRUM consistently improves Sim2Real performance across multiple representations of LiDAR data. The project page is available at https://miya-tomoya.github.io/drum.
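The raydrop-aware masked guidance described in the abstract can be sketched as a mask-blended reverse-diffusion step. This is a hypothetical illustration, not the authors' implementation: `denoise_fn`, `noise_fn`, and the per-pixel `drop_mask` are assumed interfaces, and the blending follows the general pattern of mask-based inpainting guidance (in the spirit of RePaint), where masked regions are filled in by the real-data prior and all other pixels are kept consistent with a noised copy of the synthetic input.

```python
import numpy as np

def masked_guidance_step(x_t, x_syn, drop_mask, noise_fn, denoise_fn, t):
    """One reverse-diffusion step with raydrop-aware masked guidance (sketch).

    x_t:       current noisy range image (H, W)
    x_syn:     synthetic input range image (H, W)
    drop_mask: 1.0 where the prior should induce raydrop noise, 0.0 elsewhere
    noise_fn:  forward process q(x_t | x_0) applied to the synthetic input
    denoise_fn: one reverse step under the diffusion prior trained on real scans
    """
    x_prior = denoise_fn(x_t, t)   # raydrop regions come from the real-data prior
    x_known = noise_fn(x_syn, t)   # everywhere else: synthetic input noised to level t
    keep = 1.0 - drop_mask         # pixels held consistent with the synthetic scan
    return keep * x_known + drop_mask * x_prior
```

The blend makes the trade-off explicit: structural content is pinned to the synthetic scan, while reflectance/raydrop statistics in the masked regions are free to follow the prior learned from unlabeled real data.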