SB-BEVFusion: Enhancing the Robustness against Sensor Malfunction and Corruptions

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing multimodal fusion methods degrade sharply when camera or LiDAR data are missing, corrupted, or noisy, which exposes a lack of robustness to sensor failures. This work proposes a framework-agnostic camera–LiDAR fusion module that handles cases where one of the two modalities is missing or corrupted, operating within a unified bird's-eye-view (BEV) space. The authors instantiate the module in BEVFusion and evaluate it on the MultiCorrupt dataset, where it substantially outperforms existing unified-representation approaches across a wide range of sensor deterioration scenarios and reaches state-of-the-art performance under corruption caused by extreme weather and sensor failure.

📝 Abstract

Multimodal sensor fusion has demonstrated remarkable performance improvements over unimodal approaches in 3D object detection for autonomous vehicles. Typically, existing methods transform multimodal data from independent sensors, such as camera and LiDAR, into a unified bird's-eye view (BEV) representation for fusion. Although effective in ideal conditions, this strategy suffers from substantial performance deterioration when camera or LiDAR data are missing, corrupted, or noisy. To address this vulnerability, we develop a framework-agnostic fusion module for camera and LiDAR data that handles cases in which one of the two modalities is missing or corrupted. To demonstrate the effectiveness of our module, we instantiate it in BEVFusion [1], a well-established framework for combining camera and LiDAR data for 3D object detection. Through quantitative experiments on the MultiCorrupt dataset, we show that our module improves performance when modalities are missing or corrupted. It substantially outperforms existing unified representation approaches across a wide range of sensor deterioration scenarios and reaches state-of-the-art performance when a modality is corrupted by extreme weather conditions or sensor failure.
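
The abstract does not describe the module's internals, so the following is only an illustrative sketch of the general idea of fusing in BEV space while tolerating a missing modality. It is not the paper's method. The function name `fuse_bev`, the single-channel grid type `BevMap`, and the fixed `camera_weight` blend are all assumptions made for this example; a real module would presumably operate on multi-channel learned features and choose its weighting adaptively.

```python
from typing import List, Optional

# Hypothetical type: one channel of a BEV grid, indexed [row][col].
BevMap = List[List[float]]


def fuse_bev(
    camera: Optional[BevMap],
    lidar: Optional[BevMap],
    camera_weight: float = 0.5,
) -> BevMap:
    """Blend two single-channel BEV grids, falling back to whichever is present.

    This is a fixed-weight blend chosen for illustration only; it is an
    assumption, not the fusion rule used in the paper.
    """
    if camera is None and lidar is None:
        raise ValueError("at least one modality is required")
    if camera is None:
        # Camera stream missing: pass the LiDAR grid through (as a copy).
        return [row[:] for row in lidar]
    if lidar is None:
        # LiDAR stream missing: pass the camera grid through (as a copy).
        return [row[:] for row in camera]
    same_shape = len(camera) == len(lidar) and all(
        len(c) == len(l) for c, l in zip(camera, lidar)
    )
    if not same_shape:
        raise ValueError("BEV grids must share the same shape")
    w = camera_weight
    # Both streams present: cell-wise weighted average in the shared BEV grid.
    return [
        [w * c + (1.0 - w) * l for c, l in zip(cam_row, lid_row)]
        for cam_row, lid_row in zip(camera, lidar)
    ]
```

Calling `fuse_bev(None, lidar_grid)` returns a copy of the LiDAR grid, so downstream code sees an output of the same shape whether or not the camera stream is available. Handling a corrupted (rather than absent) input would additionally need some quality estimate per modality, which this sketch does not attempt.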

Problem

Research questions and friction points this paper is trying to address.

sensor malfunction

data corruption

multimodal fusion

3D object detection

robustness

Innovation

Methods, ideas, or system contributions that make the work stand out.

sensor robustness

multimodal fusion

BEV representation