Boosting LiDAR-Based Localization with Semantic Insight: Camera Projection versus Direct LiDAR Segmentation

📅 2025-09-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the low accuracy and poor robustness of LiDAR-based localization in multi-sensor configurations, this paper proposes a camera-semantic-guided LiDAR localization enhancement method. It projects LiDAR point clouds onto the camera semantic segmentation space, leverages Depth-Anything to generate high-confidence image-level semantic maps, and integrates an adaptive LiDAR segmentation network to achieve cross-modal semantic alignment and joint optimization. Ground-truth supervision is provided by GNSS RTK, enabling an end-to-end multimodal fusion localization framework. The approach effectively mitigates limitations of pure LiDAR semantic segmentation in complex urban environments, significantly improving localization adaptability and reliability. Evaluated on a 55-km real-world vehicle test route in Karlsruhe, Germany—encompassing urban streets, multi-lane roads, and rural highways—the method reduces average localization error by 32.7% over baseline methods, demonstrating strong effectiveness and generalization capability in realistic, challenging scenarios.
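The core cross-modal step described above is geometric: each LiDAR point is transformed into the camera frame, projected through the camera intrinsics, and assigned the class of the pixel it lands on in the image-level semantic map. The sketch below illustrates this with NumPy; the function and variable names are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the authors' implementation): project LiDAR points into a
# camera's semantic segmentation map and attach per-point class labels.
# T_cam_lidar (extrinsics), K (intrinsics), and sem_image are assumed inputs.
import numpy as np

def label_points_from_semantic_image(points_lidar, T_cam_lidar, K, sem_image):
    """points_lidar: (N, 3) xyz in the LiDAR frame.
    T_cam_lidar: (4, 4) LiDAR-to-camera extrinsic transform.
    K: (3, 3) camera intrinsic matrix.
    sem_image: (H, W) integer class-ID map from the image segmenter.
    Returns (N,) labels, -1 for points that do not project into the image."""
    n = points_lidar.shape[0]
    labels = np.full(n, -1, dtype=np.int32)

    # Transform points into the camera frame (homogeneous coordinates).
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])   # (N, 4)
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]            # (N, 3)

    # Keep only points in front of the camera.
    in_front = pts_cam[:, 2] > 0.1

    # Perspective projection onto the image plane.
    uvw = (K @ pts_cam.T).T
    u = uvw[:, 0] / uvw[:, 2]
    v = uvw[:, 1] / uvw[:, 2]

    # Discard points that fall outside the image bounds.
    h, w = sem_image.shape
    inside = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Look up the semantic class at each projected pixel.
    labels[inside] = sem_image[v[inside].astype(int), u[inside].astype(int)]
    return labels
```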

📝 Abstract
Semantic segmentation of LiDAR data presents considerable challenges, particularly when dealing with diverse sensor types and configurations. However, incorporating semantic information can significantly enhance the accuracy and robustness of LiDAR-based localization techniques for autonomous mobile systems. We propose an approach that integrates semantic camera data with LiDAR segmentation to address this challenge. By projecting LiDAR points into the semantic segmentation space of the camera, our method enhances the precision and reliability of the LiDAR-based localization pipeline. For validation, we utilize the CoCar NextGen platform from the FZI Research Center for Information Technology, which offers diverse sensor modalities and configurations. The sensor setup of CoCar NextGen enables a thorough analysis of different sensor types. Our evaluation leverages the state-of-the-art Depth-Anything network for camera image segmentation and an adaptive segmentation network for LiDAR segmentation. To establish a reliable ground truth for LiDAR-based localization, we make use of a Global Navigation Satellite System (GNSS) solution with Real-Time Kinematic corrections (RTK). Additionally, we conduct an extensive 55 km drive through the city of Karlsruhe, Germany, covering a variety of environments, including urban areas, multi-lane roads, and rural highways. This multimodal approach paves the way for more reliable and precise autonomous navigation systems, particularly in complex real-world environments.
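Since GNSS RTK serves as the localization ground truth, evaluation essentially reduces to associating estimated poses with RTK fixes by timestamp and averaging the translational error. A minimal sketch of such a metric is given below; the interface and the 50 ms association tolerance are assumptions, not the paper's evaluation code.

```python
# Minimal sketch (not tied to the paper's evaluation code): mean translational
# error between an estimated trajectory and GNSS-RTK ground truth, with poses
# associated by nearest timestamp. Names and tolerances are illustrative.
import numpy as np

def mean_translational_error(est_t, est_xy, gt_t, gt_xy, max_dt=0.05):
    """est_t, gt_t: (N,), (M,) timestamps in seconds; gt_t assumed sorted.
    est_xy, gt_xy: (N, 2), (M, 2) planar positions in a metric frame (e.g. UTM).
    Pairs each estimate with the nearest ground-truth sample within max_dt."""
    idx = np.searchsorted(gt_t, est_t)
    idx = np.clip(idx, 1, len(gt_t) - 1)
    # Pick the closer of the two neighbouring ground-truth timestamps.
    left_closer = (est_t - gt_t[idx - 1]) < (gt_t[idx] - est_t)
    idx = np.where(left_closer, idx - 1, idx)
    valid = np.abs(gt_t[idx] - est_t) <= max_dt
    errors = np.linalg.norm(est_xy[valid] - gt_xy[idx[valid]], axis=1)
    return errors.mean() if errors.size else float("nan")
```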
Problem

Research questions and friction points this paper is trying to address.

Improving LiDAR-based localization accuracy using semantic information
Integrating camera semantic data with LiDAR segmentation techniques
Addressing challenges of diverse sensor types and configurations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Projecting LiDAR points into camera semantic space
Integrating semantic camera data with LiDAR segmentation (see the fusion sketch after this list)
Using adaptive segmentation network for LiDAR data
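One simple way to realize the camera-LiDAR label integration listed above is to fall back on the camera-projected label wherever the LiDAR segmentation network is uncertain. The sketch below shows such a rule; the confidence threshold and fusion policy are illustrative assumptions, not the paper's actual cross-modal alignment and joint optimization scheme.

```python
# Minimal sketch of one plausible fusion rule (not the paper's method): combine
# per-point labels from the camera projection with labels predicted by the
# LiDAR segmentation network, preferring the camera label when the LiDAR
# network is not confident. The 0.6 threshold is an illustrative assumption.
import numpy as np

def fuse_point_labels(cam_labels, lidar_labels, lidar_conf, conf_thresh=0.6):
    """cam_labels: (N,) class IDs from camera projection, -1 where unavailable.
    lidar_labels: (N,) class IDs from the LiDAR segmentation network.
    lidar_conf: (N,) per-point softmax confidence of the LiDAR prediction."""
    fused = lidar_labels.copy()
    has_cam = cam_labels >= 0
    # Override with the camera label where the LiDAR network is uncertain.
    override = has_cam & (lidar_conf < conf_thresh)
    fused[override] = cam_labels[override]
    return fused
```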