CMRNext: Camera to LiDAR Matching in the Wild for Localization and Extrinsic Calibration

📅 2024-01-31
🏛️ arXiv.org
📈 Citations: 6
Influential: 0
🤖 AI Summary
To address the poor generalization of monocular camera localization and camera–LiDAR extrinsic calibration across unseen sensors and scenes, this paper proposes a prior-free cross-modal matching framework. Methodologically, it models point-to-pixel correspondence as an optical flow estimation problem, enabling zero-shot transfer, and combines deep cross-modal feature matching, optical-flow-guided sparse correspondence generation, and geometrically constrained PnP pose estimation, eliminating the need for retraining. Evaluated on six robotic platforms (three public datasets and three in-house robots), the method achieves significantly higher accuracy than state-of-the-art approaches in both camera localization within LiDAR maps and extrinsic parameter estimation, demonstrating strong cross-sensor and cross-scene generalization and practical deployability.
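The correspondence-generation idea described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the intrinsics `K`, the initial pose `R_init`/`t_init`, and the flow field are placeholder values, and the trained network that actually predicts the flow is mocked as an input array.

```python
import numpy as np

# Assumed pinhole intrinsics (hypothetical values for illustration only).
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])

def lidar_to_initial_pixels(points_3d, R_init, t_init):
    """Project LiDAR map points into the image using a rough initial pose."""
    cam = points_3d @ R_init.T + t_init   # world -> camera frame
    valid = cam[:, 2] > 0.1               # drop points behind the camera
    uv = cam[valid] @ K.T                 # apply intrinsics
    return points_3d[valid], uv[:, :2] / uv[:, 2:3]  # perspective divide

def correspondences_from_flow(points_3d, pixels_init, flow):
    """Shift each LiDAR projection by the predicted flow (du, dv) to reach
    the matched camera pixel; the resulting (3D point, 2D pixel) pairs are
    the sparse correspondences fed to the PnP solver."""
    return points_3d, pixels_init + flow
```

In the actual system the flow comes from the cross-modal network; here any (du, dv) array of matching shape stands in for it.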

📝 Abstract
LiDARs are widely used for mapping and localization in dynamic environments. However, their high cost limits their widespread adoption. On the other hand, monocular localization in LiDAR maps using inexpensive cameras is a cost-effective alternative for large-scale deployment. Nevertheless, most existing approaches struggle to generalize to new sensor setups and environments, requiring retraining or fine-tuning. In this paper, we present CMRNext, a novel approach for camera-LiDAR matching that is independent of sensor-specific parameters, generalizable, and can be used in the wild for monocular localization in LiDAR maps and camera-LiDAR extrinsic calibration. CMRNext exploits recent advances in deep neural networks for matching cross-modal data and standard geometric techniques for robust pose estimation. We reformulate the point-pixel matching problem as an optical flow estimation problem and solve the Perspective-n-Point problem based on the resulting correspondences to find the relative pose between the camera and the LiDAR point cloud. We extensively evaluate CMRNext on six different robotic platforms, including three publicly available datasets and three in-house robots. Our experimental evaluations demonstrate that CMRNext outperforms existing approaches on both tasks and effectively generalizes to previously unseen environments and sensor setups in a zero-shot manner. We make the code and pre-trained models publicly available at http://cmrnext.cs.uni-freiburg.de.
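The final step of the pipeline solves a Perspective-n-Point problem on the flow-derived correspondences. As a rough sketch of that step, the following implements a plain linear DLT pose solve, a simplified, non-robust stand-in for the RANSAC-based PnP solver such a system would use in practice; it assumes noise-free correspondences and known intrinsics `K`.

```python
import numpy as np

def dlt_pnp(points_3d, pixels, K):
    """Linear (DLT) pose from 2D-3D correspondences: a simplified stand-in
    for robust PnP. Needs >= 6 correspondences and clean matches."""
    # Normalize pixels to camera rays: x_n = K^-1 [u, v, 1]^T.
    xn = np.column_stack([pixels, np.ones(len(pixels))]) @ np.linalg.inv(K).T
    rows = []
    for (X, Y, Z), (x, y, _) in zip(points_3d, xn):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -x * X, -x * Y, -x * Z, -x])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -y * X, -y * Y, -y * Z, -y])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    P = Vt[-1].reshape(3, 4)              # [R|t] up to scale and sign
    P /= np.linalg.svd(P[:, :3], compute_uv=False).mean()  # fix scale
    if (points_3d @ P[2, :3] + P[2, 3]).mean() < 0:
        P = -P                            # enforce positive depth (cheirality)
    U, _, V2 = np.linalg.svd(P[:, :3])
    return U @ V2, P[:, 3]                # nearest rotation matrix, translation
```

A robust deployment would wrap a solve like this in RANSAC to reject bad flow matches, which is why the paper's geometric stage is described as robust pose estimation rather than a plain linear fit.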
Problem

Research questions and friction points this paper is trying to address.

Cross-modal camera-LiDAR matching for localization
Poor generalization to new sensor setups and environments
Need for cost-effective monocular localization in LiDAR maps
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep neural cross-modal matching
Optical flow for point-pixel matching
Zero-shot generalization across environments