SB-BEVFusion: Enhancing the Robustness against Sensor Malfunction and Corruptions

📅 2026-05-12
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing multimodal fusion methods degrade sharply when camera or LiDAR data are missing, corrupted, or noisy, which exposes a lack of robustness to sensor failures. To address this, the work proposes a framework-agnostic camera–LiDAR fusion module that handles a missing or corrupted modality within a unified bird's-eye-view (BEV) space. The authors instantiate the module in BEVFusion and evaluate it on the MultiCorrupt dataset, where it substantially outperforms existing unified-representation approaches across a wide range of sensor deterioration scenarios and reaches state-of-the-art performance under extreme weather conditions and sensor failure.
📝 Abstract
Multimodal sensor fusion has demonstrated remarkable performance improvements over unimodal approaches in 3D object detection for autonomous vehicles. Typically, existing methods transform multimodal data from independent sensors, such as camera and LiDAR, into a unified bird's-eye view (BEV) representation for fusion. Although effective in ideal conditions, this strategy suffers substantial performance deterioration when camera or LiDAR data are missing, corrupted, or noisy. To address this vulnerability, we develop a framework-agnostic fusion module for camera and LiDAR data that handles cases in which one of the two modalities is missing or corrupted. To demonstrate its effectiveness, we instantiate the module in BEVFusion [1], a well-established framework for combining camera and LiDAR data for 3D object detection. In quantitative experiments on the MultiCorrupt dataset, our module improves performance when a modality is missing or corrupted. It substantially outperforms existing unified representation approaches across a wide range of sensor deterioration scenarios and reaches state-of-the-art performance when a modality is corrupted by extreme weather conditions and sensor failure.
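To make the missing-modality problem concrete, here is a minimal sketch of one simple baseline: concatenating camera and LiDAR BEV features along the channel axis and substituting a zero placeholder when a modality is absent. The abstract does not describe the internals of the proposed module, so this is not the authors' method. The function name `fuse_bev`, the `BEV` type alias, and the list-of-rows layout are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional

# Hypothetical layout: one row per channel, one column per flattened BEV cell.
BEV = List[List[float]]


def fuse_bev(
    camera: Optional[BEV],
    lidar: Optional[BEV],
    channels: int,
    cells: int,
) -> BEV:
    """Concatenate camera and LiDAR BEV features along the channel axis.

    A missing modality (None) is replaced by an all-zero placeholder so the
    fused output keeps a fixed shape of (2 * channels) x cells. Zero filling
    is an assumption made for this sketch, not the paper's strategy.
    """
    if camera is None and lidar is None:
        raise ValueError("at least one modality is required")

    def placeholder() -> BEV:
        # All-zero feature map standing in for the absent sensor.
        return [[0.0] * cells for _ in range(channels)]

    cam = camera if camera is not None else placeholder()
    lid = lidar if lidar is not None else placeholder()

    # Copy rows so the caller's inputs are never mutated through the output.
    return [row[:] for row in cam] + [row[:] for row in lid]


if __name__ == "__main__":
    camera_features = [[1.0, 2.0], [3.0, 4.0]]
    # LiDAR is unavailable, so its channels are zero-filled.
    print(fuse_bev(camera_features, None, channels=2, cells=2))
```

A downstream detector trained only on intact inputs would see these zero channels as out-of-distribution, which is one way to picture the performance deterioration the abstract refers to.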
Problem

Research questions and friction points this paper is trying to address.

sensor malfunction
data corruption
multimodal fusion
3D object detection
robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

sensor robustness
multimodal fusion
BEV representation
missing modality handling
3D object detection