🤖 AI Summary
This work addresses the challenges of extrinsic calibration between 4D radar and cameras, as well as the high cost and low reliability of radar point cloud annotation. The authors propose a unified framework built around a dual-purpose calibration target, which combines a front-facing checkerboard with a corner reflector mounted at the center of the rear surface, to achieve high-precision extrinsic calibration between the radar and camera sensors. Leveraging the calibrated extrinsic parameters, they develop a cross-modal feature alignment mechanism that automatically projects image segmentation labels onto radar point clouds, establishing an efficient, automated annotation pipeline. This approach substantially reduces manual labeling effort, improves the development efficiency of multimodal perception systems, and provides a reliable data foundation for cross-modal perception research.
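To make the label-transfer step concrete, here is a minimal sketch of the geometric projection it relies on, assuming a pinhole camera with intrinsic matrix K, a calibrated radar-to-camera extrinsic (R, t), and a per-pixel segmentation mask. The function and variable names are illustrative, not from the paper, and the paper's multi-feature optimization refinement is omitted.

```python
import numpy as np

def transfer_labels(radar_pts, seg_mask, K, R, t):
    """Assign each radar point the class label of the pixel it projects to.

    radar_pts : (N, 3) xyz points in the radar frame
    seg_mask  : (H, W) integer class IDs from an image segmentation model
    K         : (3, 3) camera intrinsic matrix
    R, t      : radar-to-camera rotation (3, 3) and translation (3,)
    Returns (N,) labels; -1 marks points behind the camera or outside the image.
    """
    # Transform radar points into the camera frame: x_cam = R @ x_radar + t.
    pts_cam = radar_pts @ R.T + t
    labels = np.full(len(radar_pts), -1, dtype=int)

    in_front = pts_cam[:, 2] > 0           # keep points in front of the camera
    uvw = pts_cam[in_front] @ K.T          # pinhole perspective projection
    uv = (uvw[:, :2] / uvw[:, 2:3]).round().astype(int)

    # Keep projections that land inside the image bounds.
    h, w = seg_mask.shape
    valid = (0 <= uv[:, 0]) & (uv[:, 0] < w) & (0 <= uv[:, 1]) & (uv[:, 1] < h)

    # Look up the segmentation class at each valid pixel (row = v, col = u).
    idx = np.flatnonzero(in_front)[valid]
    labels[idx] = seg_mask[uv[valid, 1], uv[valid, 0]]
    return labels
```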
📝 Abstract
4D radar has emerged as a critical sensor for autonomous driving, primarily due to its ability to measure elevation and its higher resolution compared to traditional 3D radar. Effective integration of 4D radar with cameras requires accurate extrinsic calibration, and the development of radar-based perception algorithms demands large-scale annotated datasets. However, existing calibration methods often employ separate targets optimized for either the visual or the radar modality, complicating the establishment of cross-modal correspondences. Furthermore, manually labeling sparse radar data is labor-intensive and unreliable. To address these challenges, we propose 4D-CAAL, a unified framework for 4D radar-camera calibration and auto-labeling. Our approach introduces a novel dual-purpose calibration target design, integrating a checkerboard pattern on the front surface for camera detection and a corner reflector at the center of the back surface for radar detection. We develop a robust correspondence matching algorithm that aligns the checkerboard center with the strongest radar reflection point, enabling accurate extrinsic calibration. Subsequently, we present an auto-labeling pipeline that leverages the calibrated sensor relationship to transfer annotations from camera-based segmentations to radar point clouds through geometric projection and multi-feature optimization. Extensive experiments demonstrate that our method achieves high calibration accuracy while significantly reducing manual annotation effort, thereby accelerating the development of robust multi-modal perception systems for autonomous driving.
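The abstract does not spell out the estimator behind the correspondence matching step, but once checkerboard centers (recovered in the camera frame) have been paired with the strongest radar reflection points across several target placements, the remaining 3D-3D rigid alignment has a well-known closed-form baseline: the Kabsch/SVD solution. The sketch below shows that baseline as an assumption, not the authors' actual algorithm.

```python
import numpy as np

def solve_extrinsics(radar_pts, cam_pts):
    """Closed-form rigid alignment (Kabsch/SVD) of matched 3D points.

    radar_pts : (N, 3) strongest-reflection points in the radar frame
    cam_pts   : (N, 3) corresponding checkerboard centers in the camera frame
    Returns R (3, 3), t (3,) such that cam_pts ≈ radar_pts @ R.T + t.
    """
    # Center both point sets on their centroids.
    mu_r, mu_c = radar_pts.mean(axis=0), cam_pts.mean(axis=0)
    # Cross-covariance between the centered sets.
    H = (radar_pts - mu_r).T @ (cam_pts - mu_c)
    U, _, Vt = np.linalg.svd(H)
    # Correct an improper rotation (reflection) if the determinant is negative.
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = mu_c - R @ mu_r
    return R, t
```

At least three non-collinear target placements are required for a unique solution; in practice, more placements and an outlier-robust variant of this fit would be used before validating the result on held-out placements.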