SPIRAL: Semantic-Aware Progressive LiDAR Scene Generation

📅 2025-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing range-view LiDAR generation methods produce only semantic-agnostic depth/reflectance maps, relying on separate post-hoc segmentation models that introduce cross-modal inconsistency. To address this, the authors propose the first single-stage diffusion model that jointly generates depth, reflectance, and semantic range images, establishing a semantic-aware progressive range-image generation paradigm. The method comprises: (i) a multi-channel co-generation architecture, (ii) a semantics-guided noise scheduling mechanism, and (iii) a lightweight range-view encoder-decoder. The paper further introduces the first dedicated evaluation metric for annotated range-view data. Experiments on SemanticKITTI and nuScenes demonstrate state-of-the-art performance with the smallest parameter count. Moreover, downstream segmentation models trained on the synthetic data achieve competitive accuracy, significantly reducing LiDAR annotation cost.
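The core idea of joint multi-channel generation can be illustrated with a minimal sketch. This is not the authors' code; the channel layout, image size, class count, and linear beta schedule below are all illustrative assumptions. It shows how depth, reflectance, and one-hot semantic labels can be stacked into a single range image so that one diffusion process noises (and, in the full model, denoises) all modalities together, keeping them aligned at every timestep:

```python
import numpy as np

# Illustrative sketch (not the paper's implementation): stack depth,
# reflectance, and one-hot semantics into one multi-channel range image.
H, W, K = 64, 1024, 4              # range-image size and class count (assumed)
rng = np.random.default_rng(0)

depth = rng.random((1, H, W))      # normalized depth channel
reflect = rng.random((1, H, W))    # normalized reflectance channel
labels = rng.integers(0, K, (H, W))
semantics = np.eye(K)[labels].transpose(2, 0, 1)   # K one-hot channels

x0 = np.concatenate([depth, reflect, semantics], axis=0)  # (2 + K, H, W)

# Standard DDPM forward noising q(x_t | x_0) with a linear beta schedule.
# All channels share one schedule, so modalities stay in lockstep.
T = 1000
betas = np.linspace(1e-4, 2e-2, T)
alpha_bar = np.cumprod(1.0 - betas)

def noisy_sample(x0, t, rng):
    """Sample x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

xt = noisy_sample(x0, 500, rng)
print(xt.shape)    # (6, 64, 1024)
```

A single denoiser trained on this stacked tensor predicts all channels at once, which is what removes the need for a separate post-hoc segmentation model; the paper's semantics-guided noise scheduling would additionally vary the schedule per modality rather than sharing one as in this sketch.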

📝 Abstract
Leveraging recent diffusion models, LiDAR-based large-scale 3D scene generation has achieved great success. While recent voxel-based approaches can generate both geometric structures and semantic labels, existing range-view methods are limited to producing unlabeled LiDAR scenes. Relying on pretrained segmentation models to predict the semantic maps often results in suboptimal cross-modal consistency. To address this limitation while preserving the advantages of range-view representations, such as computational efficiency and simplified network design, we propose Spiral, a novel range-view LiDAR diffusion model that simultaneously generates depth images, reflectance images, and semantic maps. Furthermore, we introduce novel semantic-aware metrics to evaluate the quality of the generated labeled range-view data. Experiments on the SemanticKITTI and nuScenes datasets demonstrate that Spiral achieves state-of-the-art performance with the smallest parameter count, outperforming two-step methods that combine generative and segmentation models. Additionally, we validate that range images generated by Spiral can be effectively used for synthetic data augmentation in downstream segmentation training, significantly reducing the labeling effort on LiDAR data.
Problem

Research questions and friction points this paper is trying to address.

Generate labeled LiDAR scenes in range-view efficiently
Improve cross-modal consistency in semantic map generation
Enhance downstream segmentation via synthetic data augmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Range-view LiDAR diffusion model that jointly generates depth, reflectance, and semantics
Semantic-aware metrics for evaluating generated data quality
Efficient synthetic data augmentation for downstream tasks