🤖 AI Summary
This work addresses target-less extrinsic calibration between a camera and a LiDAR sensor when the camera intrinsics are unknown. Existing deep correspondence-based methods typically assume rectified images with known intrinsics. We propose the first fully target-less pipeline that jointly estimates the camera intrinsics (pinhole model with radial and tangential distortion coefficients) and the camera-LiDAR extrinsic parameters, using deep pixel-to-point correspondences. Our approach uses Structure-from-Motion (SfM) for intrinsic initialization, generalizes camera-LiDAR matching to raw, distorted images, and tightly couples correspondence estimation with joint nonlinear optimization over intrinsics and extrinsics. Experiments on the KITTI dataset with unseen camera-LiDAR pairs show that joint calibration improves extrinsic accuracy while also recovering accurate intrinsics.
📝 Abstract
Accurate camera-LiDAR calibration is a prerequisite for robust multi-modal perception in robotics. Recent target-less approaches based on deep point correspondences achieve remarkable performance for extrinsic calibration but assume rectified images with known intrinsics. In this work, we overcome this limitation and present the first fully target-less pipeline that jointly estimates camera intrinsics (pinhole model with radial-tangential distortion) and camera-LiDAR extrinsics with deep pixel-point correspondences. Our approach extends deep correspondence-based calibration by (i) automatic intrinsic initialization via structure-from-motion, (ii) generalizing camera-LiDAR matching to raw images with unknown intrinsics including distortion, and (iii) tightly coupling correspondence estimation with joint nonlinear optimization over both intrinsics and extrinsics. We evaluate our method on the KITTI dataset with unseen camera-LiDAR pairs and demonstrate that joint calibration achieves improved extrinsic accuracy while additionally recovering accurate intrinsics.
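To make the camera model named in the abstract concrete, the following is a minimal sketch of projecting LiDAR points into a raw image with a pinhole model and radial-tangential distortion, together with the reprojection residual that a joint optimization over intrinsics and extrinsics could minimize. The abstract does not give the exact parameterization, so everything here is an assumption for illustration: the function names `project_radtan` and `reprojection_residuals`, the use of two radial (`k1`, `k2`) and two tangential (`p1`, `p2`) coefficients in the common OpenCV-style convention, and the representation of the extrinsics as a rotation matrix `R` and translation `t`.

```python
import numpy as np

def project_radtan(points_lidar, R, t, fx, fy, cx, cy, k1, k2, p1, p2):
    """Project Nx3 LiDAR points to pixels (pinhole + radial-tangential model).

    Assumed convention (not taken from the paper): p_cam = R @ p_lidar + t,
    with two radial and two tangential distortion coefficients.
    """
    pts_cam = points_lidar @ R.T + t          # LiDAR frame -> camera frame
    x = pts_cam[:, 0] / pts_cam[:, 2]         # normalized image coordinates
    y = pts_cam[:, 1] / pts_cam[:, 2]
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2     # radial distortion factor
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    u = fx * x_d + cx                         # apply focal lengths and principal point
    v = fy * y_d + cy
    return np.stack([u, v], axis=1)

def reprojection_residuals(pixels, points_lidar, R, t, intrinsics):
    """Flattened pixel errors for a set of pixel-to-point correspondences.

    `intrinsics` is the tuple (fx, fy, cx, cy, k1, k2, p1, p2).
    """
    projected = project_radtan(points_lidar, R, t, *intrinsics)
    return (projected - pixels).ravel()
```

In a joint calibration setting, a residual vector of this kind would typically be passed to a nonlinear least-squares solver that updates both `intrinsics` and the pose `(R, t)`, usually with a robust loss to tolerate wrong correspondences. How the paper parameterizes the rotation and weights the correspondences is not stated in the abstract.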