CalibRefine: Deep Learning-Based Online Automatic Targetless LiDAR-Camera Calibration with Iterative and Attention-Driven Post-Refinement

📅 2025-02-24
🤖 AI Summary
Existing LiDAR-camera calibration methods rely on manually placed targets, preliminary parameter estimates, or intensive data preprocessing, which limits their scalability and online applicability. To address this, the paper proposes CalibRefine, a fully automatic, targetless, online calibration framework that operates directly on raw point clouds and images. Cross-modal correspondences are established through object-level feature matching, and calibration proceeds in four stages: a Common Feature Discriminator that generates LiDAR-camera correspondences, a coarse homography-based calibration, an iterative refinement that improves alignment as more frames arrive, and a refinement based on a Vision Transformer with cross-attention. The method requires no calibration targets, initial parameter estimates, or ground-truth calibration matrices. Evaluated on two urban traffic datasets, it outperforms state-of-the-art targetless methods and matches or surpasses manually tuned baselines with minimal human involvement.

📝 Abstract
Accurate multi-sensor calibration is essential for deploying robust perception systems in applications such as autonomous driving, robotics, and intelligent transportation. Existing LiDAR-camera calibration methods often rely on manually placed targets, preliminary parameter estimates, or intensive data preprocessing, limiting their scalability and adaptability in real-world settings. In this work, we propose a fully automatic, targetless, and online calibration framework, CalibRefine, which directly processes raw LiDAR point clouds and camera images. Our approach is divided into four stages: (1) a Common Feature Discriminator that trains on automatically detected objects (using relative positions, appearance embeddings, and semantic classes) to generate reliable LiDAR-camera correspondences, (2) a coarse homography-based calibration, (3) an iterative refinement to incrementally improve alignment as additional data frames become available, and (4) an attention-based refinement that addresses non-planar distortions by leveraging a Vision Transformer and cross-attention mechanisms. Through extensive experiments on two urban traffic datasets, we show that CalibRefine delivers high-precision calibration results with minimal human involvement, outperforming state-of-the-art targetless methods and remaining competitive with, or surpassing, manually tuned baselines. Our findings highlight how robust object-level feature matching, together with iterative and self-supervised attention-based adjustments, enables consistent sensor fusion in complex, real-world conditions without requiring ground-truth calibration matrices or elaborate data preprocessing.
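Stages (2) and (3) of the abstract can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes that matched object centers are available as 2D LiDAR-side coordinates (for example, a bird's-eye view) and 2D image pixels, fits a homography with the standard Direct Linear Transform, and re-estimates it as correspondences from new frames accumulate. The names `estimate_homography`, `project`, and `IncrementalCalibrator` are hypothetical, and a practical version would add coordinate normalization and outlier rejection.

```python
import numpy as np


def estimate_homography(src, dst):
    """Direct Linear Transform: find H (3x3) such that dst ~ H @ src.

    src, dst: sequences of N >= 4 matched (x, y) points, with no three
    collinear among the minimal set. Assumes H[2, 2] is non-zero.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]


def project(H, pts):
    """Apply homography H to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]


class IncrementalCalibrator:
    """Accumulates correspondences across frames and refits the homography."""

    def __init__(self):
        self.src, self.dst = [], []
        self.H = None

    def add_frame(self, src_pts, dst_pts):
        self.src.extend(src_pts)
        self.dst.extend(dst_pts)
        if len(self.src) >= 4:  # a homography needs at least four pairs
            self.H = estimate_homography(self.src, self.dst)
        return self.H

    def mean_reprojection_error(self):
        diff = project(self.H, self.src) - np.asarray(self.dst, dtype=float)
        return float(np.mean(np.linalg.norm(diff, axis=1)))


# Synthetic check: correspondences generated from a known homography.
H_true = np.array([[1.2, 0.1, 5.0], [-0.05, 0.9, -3.0], [0.001, 0.002, 1.0]])
frame1 = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
frame2 = [(5.0, 2.0), (3.0, 8.0), (7.0, 7.0)]

calib = IncrementalCalibrator()
calib.add_frame(frame1, project(H_true, frame1).tolist())
H_est = calib.add_frame(frame2, project(H_true, frame2).tolist())
```

A homography only models a plane-to-plane mapping, which is consistent with the abstract's remark that the final attention-based stage targets non-planar distortions.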
Problem

Research questions and friction points this paper is trying to address.

Automatic LiDAR-camera calibration
Targetless and online refinement
High-precision sensor fusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Targetless LiDAR-Camera calibration framework.
Iterative refinement with additional data frames.
Attention-based refinement using Vision Transformer.
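As a generic illustration of the cross-attention operation named above (not the paper's architecture), the sketch below computes scaled dot-product attention in which queries come from one modality and keys and values from the other. The function names, the token counts, and the feature dimension are assumptions chosen for the example, and learned projection matrices are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    queries: (Nq, d) tokens from one modality (e.g. LiDAR-side features).
    keys:    (Nk, d) tokens from the other modality (e.g. image patches).
    values:  (Nk, dv) features attached to the keys.
    Returns the attended features (Nq, dv) and the weights (Nq, Nk).
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)  # each query's weights sum to 1
    return weights @ values, weights


rng = np.random.default_rng(0)
lidar_tokens = rng.normal(size=(5, 8))   # hypothetical LiDAR-side tokens
image_tokens = rng.normal(size=(7, 8))   # hypothetical image patch tokens
image_values = rng.normal(size=(7, 3))
attended, attn = cross_attention(lidar_tokens, image_tokens, image_values)
```

Each row of `attn` shows how strongly one query token attends to every token of the other modality, which is the mechanism that lets features from the two sensors inform each other.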
Lei Cheng
Department of Electrical and Computer Engineering, University of Arizona, 1200 E. University Blvd, Tucson, 85721, AZ, USA
Lihao Guo
Department of Electrical and Computer Engineering, University of Arizona, 1200 E. University Blvd, Tucson, 85721, AZ, USA
Tam Bang
Center For Urban Informatics and Progress (CUIP), UTC Research Institute, University of Tennessee at Chattanooga, 615 McCallie Avenue, Chattanooga, 37405, TN, USA
Austin Harris
Center For Urban Informatics and Progress (CUIP), UTC Research Institute, University of Tennessee at Chattanooga, 615 McCallie Avenue, Chattanooga, 37405, TN, USA
Mustafa Hajij
Assistant Professor of Machine Learning, University of San Francisco
Artificial Intelligence, Topological Deep Learning, Topological Neural Networks
Mina Sartipi
Center For Urban Informatics and Progress (CUIP), UTC Research Institute, University of Tennessee at Chattanooga, 615 McCallie Avenue, Chattanooga, 37405, TN, USA
Siyang Cao
The University of Arizona
Waveform Design, MIMO Radar, Sensor Fusion, Machine Learning on Sensors, Signal Processing