AI Summary
This work addresses the limitations of manual feature matching and poor robustness in camera-LiDAR extrinsic calibration. We propose an end-to-end, feature-free coarse-to-fine alignment method. Our approach fuses LiDAR intensity images with monocular depth predictions to establish a dual-loss alignment framework: a structure loss based on patch-wise Pearson correlation enforces geometric consistency, while a mutual information-based texture loss enforces radiometric consistency. To incorporate strong geometric priors, we leverage a pre-trained monocular depth model and design a lightweight spatial search optimization framework. The method exhibits strong scene adaptability and cross-dataset generalizability. Extensive experiments on KITTI, Waymo, and MIAS-LCEC demonstrate significant improvements over state-of-the-art methods in both calibration accuracy and robustness. Code is publicly available.
Abstract
In this paper, we unleash the potential of powerful monocular depth models in camera-LiDAR calibration and propose CLAIM, a novel method for aligning data from a camera and a LiDAR. Given an initial guess and pairs of images and LiDAR point clouds, CLAIM uses a coarse-to-fine search to find the transformation minimizing a patch-wise Pearson correlation-based structure loss and a mutual information-based texture loss. These two losses serve as reliable metrics for camera-LiDAR alignment and, unlike most methods, require no complicated data processing, feature extraction, or feature matching, rendering our method simple and adaptable to most scenes. We validate CLAIM on the public KITTI, Waymo, and MIAS-LCEC datasets, and the experimental results demonstrate its superior performance compared with state-of-the-art methods. The code is available at https://github.com/Tompson11/claim.
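To make the two alignment metrics concrete, below is a minimal NumPy sketch of a patch-wise Pearson correlation structure loss and a histogram-based mutual information score. This is an illustrative reconstruction, not the authors' implementation: the patch size, valid-point threshold, and histogram bin count are assumed hyperparameters, and the released code at the repository above should be treated as authoritative.

```python
import numpy as np

def patch_pearson_loss(pred_depth, lidar_depth, patch=16, eps=1e-8):
    """Structure loss sketch: 1 minus the mean Pearson correlation over
    non-overlapping patches between monocular depth and projected LiDAR depth.
    Only pixels with a LiDAR return (depth > 0) contribute; patch size and
    the minimum-points threshold are assumed, not from the paper."""
    H, W = pred_depth.shape
    corrs = []
    for i in range(0, H - patch + 1, patch):
        for j in range(0, W - patch + 1, patch):
            a = pred_depth[i:i + patch, j:j + patch].ravel()
            b = lidar_depth[i:i + patch, j:j + patch].ravel()
            mask = b > 0
            if mask.sum() < 8:  # skip patches with too few LiDAR returns
                continue
            a, b = a[mask] - a[mask].mean(), b[mask] - b[mask].mean()
            denom = np.sqrt((a * a).sum() * (b * b).sum()) + eps
            corrs.append((a * b).sum() / denom)
    return 1.0 - float(np.mean(corrs)) if corrs else 1.0

def mutual_information(img_a, img_b, bins=32, eps=1e-12):
    """Texture loss sketch: mutual information between camera grayscale and
    projected LiDAR intensity, estimated from a joint histogram."""
    hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = hist / (hist.sum() + eps)          # joint distribution
    px = pxy.sum(axis=1, keepdims=True)      # marginals
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())
```

In a coarse-to-fine search, candidate extrinsics would be scored by projecting the point cloud with each candidate, then minimizing the structure loss (and maximizing mutual information) over progressively finer perturbations of the initial guess.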