BEVCALIB: LiDAR-Camera Calibration via Geometry-Guided Bird's-Eye View Representations

📅 2025-06-03
🤖 AI Summary
Existing LiDAR-camera extrinsic calibration methods for autonomous driving rely heavily on artificial calibration targets, controlled environments, and static-scene assumptions, which limits practical deployment. To address these limitations, this paper proposes the first end-to-end calibration method leveraging bird's-eye-view (BEV) features. Its core contributions are: (i) the first integration of BEV representations into extrinsic calibration; (ii) a geometry-guided BEV feature selector enabling cross-modal feature alignment and differentiable transformation regression; and (iii) support for raw sensor inputs and dynamic in-motion calibration. The method extracts multi-modal BEV features using CNNs and PointPillars, performs cross-modal spatial alignment, and applies geometry-aware feature selection. On KITTI, the approach reduces translation and rotation errors by 47.08% and 82.32%, respectively, over the best prior baseline, and it improves on the best reproducible open-source baseline by an order of magnitude.

📝 Abstract
Accurate LiDAR-camera calibration is fundamental to fusing multi-modal perception in autonomous driving and robotic systems. Traditional calibration methods require extensive data collection in controlled environments and cannot compensate for transformation changes during vehicle or robot motion. In this paper, we propose the first model that uses bird's-eye-view (BEV) features to perform LiDAR-camera calibration from raw data, termed BEVCALIB. To achieve this, we extract camera BEV features and LiDAR BEV features separately and fuse them into a shared BEV feature space. To fully utilize the geometric information in the BEV features, we introduce a novel feature selector that filters the most important features in the transformation decoder, which reduces memory consumption and enables efficient training. Extensive evaluations on KITTI, nuScenes, and our own dataset demonstrate that BEVCALIB establishes a new state of the art. Under various noise conditions, BEVCALIB outperforms the best baseline in the literature by an average of (47.08%, 82.32%) on the KITTI dataset and (78.17%, 68.29%) on the nuScenes dataset, in terms of (translation, rotation) error, respectively. In the open-source domain, it improves on the best reproducible baseline by one order of magnitude. Our code and demo results are available at https://cisl.ucr.edu/BEVCalib.
Problem

Research questions and friction points this paper is trying to address.

Achieving accurate LiDAR-camera extrinsic calibration without calibration targets or controlled environments
Compensating for transformation changes that arise during vehicle or robot motion
Keeping memory consumption tractable when decoding dense BEV features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses BEV features for LiDAR-camera calibration
Introduces novel feature selector for efficiency
Achieves state-of-the-art performance on datasets
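The fuse-then-select idea behind the feature selector can be sketched minimally. Everything below is an illustrative assumption, not the authors' implementation: the function names, the array shapes, and especially the norm-based scoring rule, which stands in for the paper's learned geometry-guided scorer. The point it shows is how selecting a small top-k subset of BEV cells shrinks the input handed to the transformation decoder.

```python
import numpy as np

def fuse_bev_features(cam_bev, lidar_bev):
    """Fuse per-modality BEV feature maps of shape (C, H, W) by
    channel concatenation into a shared BEV feature space."""
    assert cam_bev.shape[1:] == lidar_bev.shape[1:], "BEV grids must align"
    return np.concatenate([cam_bev, lidar_bev], axis=0)

def select_top_k(fused_bev, k):
    """Stand-in for the geometry-guided selector: keep the k BEV cells
    with the largest feature norm (a hypothetical scoring rule)."""
    c, h, w = fused_bev.shape
    flat = fused_bev.reshape(c, h * w)
    scores = np.linalg.norm(flat, axis=0)   # one score per BEV cell
    idx = np.argsort(scores)[-k:]           # indices of the top-k cells
    return flat[:, idx], idx

rng = np.random.default_rng(0)
cam = rng.standard_normal((64, 32, 32))     # assumed camera BEV map
lidar = rng.standard_normal((64, 32, 32))   # assumed LiDAR BEV map
fused = fuse_bev_features(cam, lidar)
feats, idx = select_top_k(fused, k=256)
print(fused.shape, feats.shape)             # (128, 32, 32) (128, 256)
```

Only the 256 selected cells (instead of all 1024 grid cells) would be fed to the transformation decoder, which is where the paper's memory savings come from.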