Enhancing 3D LiDAR Segmentation by Shaping Dense and Accurate 2D Semantic Predictions

📅 2026-02-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of 3D LiDAR semantic segmentation, where projecting 3D points into 2D intermediate representations often results in sparse and incomplete labels, thereby limiting segmentation performance. To overcome this limitation, the authors propose MM2D3D, a method that reformulates 3D segmentation as a 2D task by leveraging camera images as an auxiliary modality. A cross-modal guided filtering module is introduced to mitigate label sparsity, while a dynamic cross pseudo-supervision mechanism enhances LiDAR-image fusion, yielding dense and accurate 2D semantic predictions. Extensive experiments demonstrate that MM2D3D significantly outperforms existing approaches in both 2D and 3D spaces, achieving superior accuracy and robustness in 3D LiDAR semantic segmentation.
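The dynamic cross pseudo-supervision idea described above — two branches supervising each other with pseudo labels so the sparse LiDAR branch learns to mimic the dense camera branch — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name, the symmetric cross-entropy formulation, and the single `weight` balancing term are all assumptions for exposition (the paper's "dynamic" weighting scheme is not specified here).

```python
import numpy as np

def cross_pseudo_supervision_loss(logits_lidar, logits_cam, weight=1.0):
    """Illustrative cross pseudo-supervision: each branch's hard argmax
    prediction serves as a pseudo label for the other branch.
    logits_lidar, logits_cam: (N, C) per-pixel class logits."""
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    p_lidar = softmax(logits_lidar)
    p_cam = softmax(logits_cam)

    pseudo_from_cam = p_cam.argmax(-1)      # dense camera branch supervises LiDAR branch
    pseudo_from_lidar = p_lidar.argmax(-1)  # LiDAR branch supervises camera branch

    n = len(p_lidar)
    ce_lidar = -np.log(p_lidar[np.arange(n), pseudo_from_cam] + 1e-12).mean()
    ce_cam = -np.log(p_cam[np.arange(n), pseudo_from_lidar] + 1e-12).mean()
    return weight * (ce_lidar + ce_cam)
```

In practice such a loss is added to the supervised segmentation losses of both branches; the camera branch's dense predictions pull the projected-LiDAR branch toward a dense output distribution.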

📝 Abstract
Semantic segmentation of 3D LiDAR point clouds is important in urban remote sensing for understanding real-world street environments. This task, by projecting LiDAR point clouds and 3D semantic labels as sparse maps, can be reformulated as a 2D problem. However, the intrinsic sparsity of the projected LiDAR and label maps can result in sparse and inaccurate intermediate 2D semantic predictions, which in turn limits the final 3D accuracy. To address this issue, we enhance this task by shaping dense and accurate 2D predictions. Specifically, we develop a multi-modal segmentation model, MM2D3D. By leveraging camera images as auxiliary data, we introduce cross-modal guided filtering to overcome label map sparsity by constraining intermediate 2D semantic predictions with dense semantic relations derived from the camera images; and we introduce dynamic cross pseudo supervision to overcome LiDAR map sparsity by encouraging the 2D predictions to emulate the dense distribution of the semantic predictions from the camera images. Experiments show that our techniques enable our model to achieve intermediate 2D semantic predictions with dense distribution and higher accuracy, which effectively enhances the final 3D accuracy. Comparisons with previous methods demonstrate our superior performance in both 2D and 3D spaces.
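To make the guided-filtering idea concrete: a guided filter propagates structure from a dense guidance signal (here, the camera image) into a sparse or noisy target (here, an intermediate 2D prediction map). The sketch below is a plain single-channel guided filter in the style of He et al., shown only as a reference point — the paper's cross-modal guided filtering module is learned and multi-channel, and `box_mean`, `guided_filter`, and their parameters are illustrative names, not the authors' API.

```python
import numpy as np

def box_mean(x, r):
    """Mean over a (2r+1) x (2r+1) window, edge-padded."""
    k = 2 * r + 1
    xp = np.pad(x, r, mode="edge")
    out = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (k * k)

def guided_filter(guide, src, r=2, eps=1e-3):
    """Classic guided filter: locally fits src as a linear function of
    guide, so edges of the dense guide shape the filtered src."""
    mean_I = box_mean(guide, r)
    mean_p = box_mean(src, r)
    cov_Ip = box_mean(guide * src, r) - mean_I * mean_p
    var_I = box_mean(guide * guide, r) - mean_I * mean_I
    a = cov_Ip / (var_I + eps)       # local linear coefficients
    b = mean_p - a * mean_I
    return box_mean(a, r) * guide + box_mean(b, r)
```

Applied per class-score channel with the camera image as `guide`, this kind of operation fills in sparse prediction maps while respecting image edges, which is the intuition behind constraining the 2D predictions with dense semantic relations from the camera.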
Problem

Research questions and friction points this paper is trying to address.

3D LiDAR segmentation
semantic segmentation
sparsity
2D projection
point cloud
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-modal segmentation
cross-modal guided filtering
dynamic cross pseudo supervision
3D LiDAR segmentation
dense semantic prediction
Xiaoyu Dong
The University of Tokyo, Tokyo, Japan; RIKEN AIP, Tokyo, Japan
Tiankui Xian
The University of Tokyo, Tokyo, Japan
Wanshui Gan
The University of Tokyo, Tokyo, Japan; RIKEN AIP, Tokyo, Japan
Naoto Yokoya
The University of Tokyo, RIKEN
Remote Sensing · Computer Vision · Machine Learning · Data Fusion