🤖 AI Summary
In autonomous driving, multi-sensor fusion of camera, LiDAR, and 4D radar suffers from poor robustness under partial modality degradation or failure, high computational overhead, and inconsistent cross-modal feature representation. To address these challenges, this paper proposes an availability-aware sensor fusion (ASF) framework. Its core contributions are: (1) unified canonical projection (UCP), which enforces geometrically consistent feature alignment across modalities in a shared canonical space; and (2) cross-attention across sensors along patches (CASAP), an efficient, patch-based cross-sensor attention mechanism that jointly improves robustness and inference efficiency. The method avoids modality reconstruction and redundant parameters. Evaluated on the K-Radar dataset, it achieves 87.2% BEV detection AP (+9.7% over the prior state of the art) and 73.6% 3D detection AP (+20.1%), significantly outperforming existing methods while maintaining low inference latency.
📝 Abstract
Sensor fusion of camera, LiDAR, and 4-dimensional (4D) Radar has brought significant performance improvements in autonomous driving (AD). However, fundamental challenges remain: deeply coupled fusion methods assume continuous sensor availability, making them vulnerable to sensor degradation and failure, whereas sensor-wise cross-attention fusion methods struggle with computational cost and unified feature representation. This paper presents availability-aware sensor fusion (ASF), a novel method that employs unified canonical projection (UCP) to enable consistency across all sensor features for fusion, and cross-attention across sensors along patches (CASAP) to enhance the robustness of sensor fusion against sensor degradation and failure. As a result, the proposed ASF shows superior object detection performance compared to existing state-of-the-art fusion methods under various weather and sensor degradation (or failure) conditions. Extensive experiments on the K-Radar dataset demonstrate that ASF achieves improvements of 9.7% in AP BEV (87.2%) and 20.1% in AP 3D (73.6%) in object detection at IoU=0.5, while requiring low computational cost. The code will be available at https://github.com/kaist-avelab/K-Radar.
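To make the two ideas above concrete, here is a minimal NumPy sketch of patch-wise cross-sensor attention with an availability mask. It assumes sensor features have already been projected into a shared canonical space and chopped into patches (the UCP step); the function name, tensor shapes, and the mean-over-available-sensors readout are illustrative assumptions, not the paper's actual implementation. The key properties it demonstrates are that attention is computed across sensors within each patch (so cost scales with the number of sensors, not all spatial positions jointly), and that an unavailable sensor is masked out so its features cannot influence the fused result.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def patch_cross_sensor_attention(feats, available):
    """Illustrative sketch (not the paper's code).

    feats: (S, P, D) array — S sensors, P patches, D feature dim,
           assumed already aligned in a shared canonical space (UCP idea).
    available: (S,) boolean availability flags per sensor.
    Returns one fused (P, D) feature map.
    """
    S, P, D = feats.shape
    # Per-patch attention scores across sensors: scores[p, s, t]
    # compares sensor s (query) with sensor t (key) at patch p.
    scores = np.einsum('spd,tpd->pst', feats, feats) / np.sqrt(D)
    # Mask unavailable sensors as keys so failed modalities get zero weight.
    mask = np.where(available[None, None, :], 0.0, -1e9)
    weights = softmax(scores + mask, axis=-1)          # (P, S, S)
    attended = np.einsum('pst,tpd->spd', weights, feats)  # (S, P, D)
    # Fuse by averaging over the sensors that are actually available.
    return attended[available].mean(axis=0)            # (P, D)
```

Because the mask removes an unavailable sensor both as a key and from the final average, the fused output is unchanged no matter what that sensor's (degraded) features contain, which is the robustness property the abstract claims for CASAP.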