CLAIM: Camera-LiDAR Alignment with Intensity and Monodepth

πŸ“… 2025-12-15
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the reliance on manual feature matching and the poor robustness of existing camera–LiDAR extrinsic calibration methods. We propose an end-to-end, feature-free, coarse-to-fine alignment method. The approach fuses LiDAR intensity images with monocular depth predictions in a dual-loss alignment framework: a structural loss based on patch-wise Pearson correlation enforces geometric consistency, while a mutual-information-based texture loss enforces radiometric consistency. To incorporate strong geometric priors, we leverage a pre-trained monocular depth model and design a lightweight spatial-search optimization framework. The method adapts across scenes and generalizes across datasets. Extensive experiments on KITTI, Waymo, and MIAS-LCEC demonstrate significant improvements over state-of-the-art methods in both calibration accuracy and robustness. Code is publicly available.

πŸ“ Abstract
In this paper, we unleash the potential of powerful monodepth models in camera-LiDAR calibration and propose CLAIM, a novel method for aligning data from a camera and a LiDAR. Given an initial guess and pairs of images and LiDAR point clouds, CLAIM uses a coarse-to-fine search to find the optimal transformation minimizing a patch-wise Pearson-correlation-based structure loss and a mutual-information-based texture loss. These two losses serve as good metrics for camera-LiDAR alignment and, unlike most methods, require no complicated data processing, feature extraction, or feature matching, making our method simple and adaptive to most scenes. We validate CLAIM on the public KITTI, Waymo, and MIAS-LCEC datasets, and the experimental results demonstrate its superior performance compared with state-of-the-art methods. The code is available at https://github.com/Tompson11/claim.
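The two losses described in the abstract can be sketched in a few lines of numpy. This is a minimal illustration of the general loss types named above (patch-wise Pearson correlation between projected LiDAR depth and predicted monocular depth, and mutual information between the camera image and the projected LiDAR intensity image), not the authors' implementation; the patch size, histogram binning, and zero-as-invalid masking convention are assumptions.

```python
import numpy as np

def patch_pearson_loss(mono_depth, lidar_depth, patch=32, eps=1e-6):
    """Structure loss: 1 minus the mean patch-wise Pearson correlation
    between predicted monocular depth and projected LiDAR depth.
    Pixels where lidar_depth == 0 are treated as not hit by any point."""
    H, W = mono_depth.shape
    corrs = []
    for y in range(0, H - patch + 1, patch):
        for x in range(0, W - patch + 1, patch):
            a = mono_depth[y:y+patch, x:x+patch].ravel()
            b = lidar_depth[y:y+patch, x:x+patch].ravel()
            mask = b > 0
            if mask.sum() < 2:
                continue
            a, b = a[mask], b[mask]
            a = (a - a.mean()) / (a.std() + eps)
            b = (b - b.mean()) / (b.std() + eps)
            corrs.append(np.mean(a * b))
    return 1.0 - float(np.mean(corrs)) if corrs else 1.0

def mutual_info_loss(gray, intensity, bins=32):
    """Texture loss: negative mutual information between the camera
    grayscale image and the projected LiDAR intensity image,
    estimated from a joint histogram over valid pixels."""
    mask = intensity > 0
    h, _, _ = np.histogram2d(gray[mask], intensity[mask], bins=bins)
    p = h / h.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    mi = np.sum(p[nz] * np.log(p[nz] / (px[:, None] * py[None, :])[nz]))
    return -float(mi)
```

Both losses are extremum metrics over the extrinsic transformation: a well-aligned projection yields high per-patch correlation (structure loss near 0) and a sharply peaked joint histogram (strongly negative texture loss), which is why no explicit feature extraction or matching is needed.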
Problem

Research questions and friction points this paper is trying to address.

Aligns camera and LiDAR data using monodepth and intensity
Minimizes structure and texture losses without complex feature extraction
Validates superior performance on public datasets like KITTI and Waymo
Innovation

Methods, ideas, or system contributions that make the work stand out.

Monodepth model for camera-LiDAR calibration
Coarse-to-fine search minimizing correlation and information losses
No complex data processing or feature matching required
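The coarse-to-fine search mentioned above can be illustrated with a generic shrinking-step coordinate search over the pose parameters. This is a hypothetical sketch of that class of optimizer, not the paper's lightweight spatial-search framework; the step schedule, shrink factor, and greedy per-axis probing are assumptions.

```python
import numpy as np
from itertools import product

def coarse_to_fine_search(loss_fn, init_params, init_step=1.0,
                          shrink=0.5, n_levels=4):
    """Greedy coarse-to-fine search: at each level, probe each parameter
    (e.g. the 6 DoF of a camera-LiDAR extrinsic) in +/- step directions,
    accept any improvement, then halve the step for the next level."""
    best = np.asarray(init_params, dtype=float)
    best_loss = loss_fn(best)
    step = init_step
    for _ in range(n_levels):
        improved = True
        while improved:
            improved = False
            for i, d in product(range(len(best)), (-step, step)):
                cand = best.copy()
                cand[i] += d
                cand_loss = loss_fn(cand)
                if cand_loss < best_loss:
                    best, best_loss = cand, cand_loss
                    improved = True
        step *= shrink
    return best, best_loss
```

In a calibration setting, `loss_fn` would re-project the point cloud under the candidate extrinsics and return a weighted sum of the structure and texture losses; the coarse levels tolerate a poor initial guess while the fine levels refine the estimate.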
Authors

Zhuo Zhang
Institute for Infocomm Research, A*STAR, Singapore
Bio- and Medical Informatics · Data Mining · Machine Learning

Yonghui Liu
Mach Drive, Beijing, China

Meijie Zhang
Mach Drive, Beijing, China

Feiyang Tan
Mach Drive, Beijing, China

Yikang Ding
Tsinghua University
3D Vision · Generative Model