REFNet++: Multi-Task Efficient Fusion of Camera and Radar Sensor Data in Bird's-Eye Polar View

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses camera-radar multimodal fusion, where radar is robust to adverse weather but suffers from noisy output and limited classification capability. The authors propose an end-to-end architecture that takes front-view camera images and raw radar range-Doppler (RD) spectra as inputs and aligns both in the bird's-eye-view (BEV) polar domain. A variational encoder-decoder learns to transform the camera data into BEV polar coordinates, while a radar encoder-decoder recovers angular information from the RD data to produce range-azimuth (RA) features. With both modalities represented in a compatible domain, their features are fused for vehicle detection and free-space segmentation, with attention to computational efficiency as well as accuracy. Experimental results on the RADIal dataset are reported to show that the approach outperforms state-of-the-art methods on both tasks.

📝 Abstract

Camera sensors generally offer a realistic view of the vehicle's surroundings, which is crucial for environmental perception. Affordable radar sensors, on the other hand, are becoming invaluable due to their robustness in variable weather conditions. However, because of their noisy output and reduced classification capability, they work best when combined with other sensor data. Specifically, we address the challenge of multimodal sensor fusion by aligning radar and camera data in a unified domain, prioritizing not only accuracy but also computational efficiency. Our work leverages the raw range-Doppler (RD) spectrum from radar and front-view camera images as inputs. To enable effective fusion, we employ a variational encoder-decoder architecture that learns the transformation of front-view camera data into the Bird's-Eye View (BEV) polar domain. Concurrently, a radar encoder-decoder learns to recover the angle information from the RD data to produce Range-Azimuth (RA) features. This alignment ensures that both modalities are represented in a compatible domain, facilitating robust and efficient sensor fusion. We evaluated our fusion strategy for vehicle detection and free-space segmentation against state-of-the-art methods using the RADIal dataset.
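
To make the idea of a shared BEV polar domain concrete, here is a minimal sketch in plain Python. It is not the authors' network: the paper learns the camera-to-BEV and RD-to-RA mappings with encoder-decoders, whereas this only illustrates what a range-azimuth grid is and why aligned grids are easy to fuse. The function names (`polar_bev_bin`, `fuse_cells`), the grid sizes, the field of view, and the use of simple concatenation are all assumptions made for illustration.

```python
import math


def polar_bev_bin(x, y, max_range=100.0, n_range=10, fov_deg=180.0, n_azimuth=18):
    """Map a ground-plane point to a (range_bin, azimuth_bin) cell.

    x is metres ahead of the sensor, y is metres to the left.
    Grid extents and resolution are hypothetical, not taken from the paper.
    Returns None when the point falls outside the grid.
    """
    r = math.hypot(x, y)
    az = math.degrees(math.atan2(y, x))  # 0 degrees = straight ahead
    half_fov = fov_deg / 2.0
    if r >= max_range or az < -half_fov or az >= half_fov:
        return None
    r_bin = int(r / max_range * n_range)
    a_bin = int((az + half_fov) / fov_deg * n_azimuth)
    return r_bin, a_bin


def fuse_cells(camera_feat, radar_feat):
    """Concatenate per-cell feature vectors of two grids with identical shape.

    Each grid is a list of rows, each row a list of cells, each cell a list
    of floats. Concatenation stands in for whatever fusion the paper uses.
    """
    if len(camera_feat) != len(radar_feat):
        raise ValueError("grids must have the same number of range rows")
    fused = []
    for cam_row, rad_row in zip(camera_feat, radar_feat):
        if len(cam_row) != len(rad_row):
            raise ValueError("grids must have the same number of azimuth columns")
        fused.append([c + r for c, r in zip(cam_row, rad_row)])
    return fused
```

Once camera and radar features are indexed by the same (range, azimuth) cells, fusing them reduces to a per-cell operation, which is the practical benefit of aligning both modalities in one domain before fusion.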

Problem

Research questions and friction points this paper is trying to address.

sensor fusion

camera-radar alignment

Bird's-Eye View

multimodal perception

efficient fusion

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-task fusion

bird's-eye polar view

range-Doppler spectrum