🤖 AI Summary
This work addresses the challenge of distinguishing observations that genuinely constrain the extrinsic parameters from those that mainly add noise or ambiguity in online multi-modal calibration. The authors propose a support-map-driven calibration framework that decouples the process into four stages: initial calibration, cross-modal residual extraction, support-map estimation, and support-aware refinement. Their key contribution is a dense calibration support map that aggregates cross-modal agreement over aligned observations and highlights where calibration evidence is consistently reliable. The framework is instantiated for online LiDAR-camera calibration by combining MDPCalib, a target-less calibration method based on motion and deep point correspondences, with CMRNext, a dense matching model that predicts optical-flow-like image-plane residuals. Experiments on the Bacchus Long-Term (BLT) and KITTI datasets show that calibration evidence is spatially and semantically non-uniform. On KITTI, support-guided refinement improves translation accuracy, while rotational gains remain limited.
📝 Abstract
Reliable multi-modal calibration requires identifying which observations truly constrain the extrinsic parameters and which ones mainly add noise or ambiguity. In this paper, we propose a support-map-driven approach to multi-modal calibration that decouples four functional blocks: initial calibration, cross-modal residual extraction, support-map estimation, and support-aware refinement. We instantiate this formulation for online LiDAR-camera calibration using MDPCalib, a target-less LiDAR-camera calibration method based on motion and deep point correspondences, and CMRNext, a dense LiDAR-camera matching model that predicts optical-flow-like image-plane residuals. The key contribution is a dense calibration support map that aggregates cross-modal agreement over aligned observations and highlights where calibration evidence is consistently reliable. Across the Bacchus Long-Term (BLT) dataset and KITTI, we show that calibration evidence is spatially and semantically non-uniform, indicating that some semantic regions provide stronger cues for calibration than others. On KITTI, support-guided refinement improves calibration performance, yielding better translation accuracy, while rotational gains remain limited.
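
As a rough illustration of what "aggregating cross-modal agreement over aligned observations" could look like, the sketch below averages a per-pixel agreement score across frames and then uses the result to weight one frame's residuals. It is not the paper's implementation: the array layout, the Gaussian agreement kernel, the `sigma` parameter, and both function names are assumptions made only for this example.

```python
import numpy as np


def aggregate_support_map(residuals, valid, sigma=2.0):
    """Average per-pixel cross-modal agreement over a stack of frames.

    residuals: (T, H, W, 2) image-plane residuals in pixels (hypothetical layout).
    valid:     (T, H, W) boolean mask, True where a residual was observed.
    sigma:     assumed residual scale in pixels for the agreement kernel.
    """
    residuals = np.asarray(residuals, dtype=float)
    valid = np.asarray(valid, dtype=bool)
    # Gaussian agreement score in (0, 1]: a small residual maps close to 1.
    sq_norm = np.sum(residuals ** 2, axis=-1)
    agreement = np.exp(-sq_norm / (2.0 * sigma ** 2))
    # Sum agreement only over observed pixels, then normalize by the count.
    total = np.sum(np.where(valid, agreement, 0.0), axis=0)
    count = np.sum(valid, axis=0)
    support = np.zeros_like(total)
    np.divide(total, count, out=support, where=count > 0)
    return support


def support_weighted_residual(residual, valid, support):
    """Support-weighted mean of one frame's residual field, shape (2,)."""
    weights = np.where(valid, support, 0.0)
    norm = weights.sum()
    if norm == 0.0:
        return np.zeros(2)
    return (weights[..., None] * residual).sum(axis=(0, 1)) / norm
```

Pixels that are never observed receive zero support, so they contribute nothing to the weighted residual. In a full pipeline the weights would presumably enter the refinement objective rather than a simple mean, which is used here only to keep the example short.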