Depth-aware Fusion Method based on Image and 4D Radar Spectrum for 3D Object Detection

📅 2025-02-21
🤖 AI Summary
To address the insufficient robustness of autonomous-vehicle 3D detection under adverse weather, this paper proposes a complementary BEV-space fusion method integrating 4D millimeter-wave radar and camera modalities. The method comprises four key components: 4D radar signal processing, BEV feature projection, GAN-based depth estimation, and multi-sensor geometric alignment. Its core contributions are: (1) a GAN-based paradigm that synthesizes depth maps directly from radar spectra, compensating for the absence of dedicated depth sensors; and (2) a depth-guided cross-modal attention mechanism that jointly encodes sparse radar point clouds and dense visual semantic features in a unified BEV representation. Evaluated on a real-world automotive dataset, the approach improves 3D detection mAP by 12.6% and reduces the false-positive rate by 37% under adverse weather (rain, snow, fog), significantly enhancing environmental robustness.
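The depth-guided cross-modal attention described above can be sketched at the level of a single BEV cell. This is a minimal illustration under stated assumptions, not the paper's implementation: modeling the depth guidance as an additive attention bias, and the function names and feature sizes, are assumptions for the sketch.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_modal_attention(radar_feat, cam_feats, depth_scores):
    """One BEV cell: the radar feature (query) attends over candidate
    camera features (keys/values); depth agreement enters as an
    additive bias on the attention logits (an assumption here)."""
    d = len(radar_feat)
    logits = []
    for cam, depth_bias in zip(cam_feats, depth_scores):
        dot = sum(q * k for q, k in zip(radar_feat, cam)) / math.sqrt(d)
        logits.append(dot + depth_bias)  # depth-guided logit
    weights = softmax(logits)
    fused = [sum(w * cam[i] for w, cam in zip(weights, cam_feats))
             for i in range(d)]
    return fused, weights
```

Raising a candidate's depth score shifts attention weight toward it, which is the intuition behind letting depth guide the fusion of sparse radar and dense image features.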

📝 Abstract
Safety and reliability are crucial for the public acceptance of autonomous driving. To ensure accurate and reliable environmental perception, intelligent vehicles must exhibit accuracy and robustness in various environments. Millimeter-wave radar, known for its high penetration capability, can operate effectively in adverse weather conditions such as rain, snow, and fog. Traditional 3D millimeter-wave radars can only provide range, Doppler, and azimuth information for objects. Although the recent emergence of 4D millimeter-wave radars has added elevation resolution, the radar point clouds remain sparse due to Constant False Alarm Rate (CFAR) operations. In contrast, cameras offer rich semantic details but are sensitive to lighting and weather conditions. Hence, this paper leverages these two highly complementary and cost-effective sensors, 4D millimeter-wave radar and camera. By integrating 4D radar spectra with depth-aware camera images and employing attention mechanisms, we fuse texture-rich images with depth-rich radar data in the Bird's Eye View (BEV) perspective, enhancing 3D object detection. Additionally, we propose using GAN-based networks to generate depth images from radar spectra in the absence of depth sensors, further improving detection accuracy.
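The abstract attributes the sparsity of radar point clouds to CFAR processing. A minimal cell-averaging CFAR (CA-CFAR) sketch shows why: a cell survives as a detection only if it exceeds a threshold scaled from the surrounding noise estimate, so most cells are discarded. All parameters (`guard`, `train`, `scale`) are illustrative, not the radar's actual settings.

```python
def ca_cfar(signal, guard=1, train=3, scale=4.0):
    """Cell-averaging CFAR over a 1D power profile.

    For each cell under test, average `train` training cells on each
    side (skipping `guard` guard cells), and declare a detection only
    if the cell exceeds `scale` times that local noise estimate.
    """
    half = guard + train
    detections = []
    for i in range(half, len(signal) - half):
        train_cells = (signal[i - half:i - guard] +
                       signal[i + guard + 1:i + half + 1])
        threshold = scale * sum(train_cells) / len(train_cells)
        if signal[i] > threshold:
            detections.append(i)
    return detections
```

A flat noise floor yields no detections at all, while only strong, isolated peaks pass the threshold, which is why the resulting point clouds are sparse and why the paper fuses the raw spectra with camera data instead of relying on CFAR output alone.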
Problem

Research questions and friction points this paper is trying to address.

Enhance 3D object detection accuracy
Fuse 4D radar and camera data
Operate in adverse weather conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fuses 4D radar and camera data
Uses attention mechanisms for integration
Employs GANs for depth image generation
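The BEV projection and geometric alignment steps above hinge on per-pixel depth: given an estimated depth (here, from the GAN), each camera pixel can be back-projected into a BEV grid cell. A minimal pinhole-camera sketch; the intrinsics, cell size, and function name are placeholder assumptions, not values from the paper.

```python
def pixel_to_bev(u, v, depth, fx, fy, cx, cy, cell_size=0.5):
    """Back-project a pixel with estimated depth into a BEV grid cell.

    Camera frame convention (assumed): x right, y down, z forward.
    BEV row indexes forward range, column indexes lateral offset.
    """
    x = (u - cx) * depth / fx   # lateral offset in metres
    z = depth                   # forward range in metres
    col = int(x / cell_size)
    row = int(z / cell_size)
    return row, col
```

A pixel at the principal point maps to the cell straight ahead of the camera; pixels to the right of it map to positive columns. In the paper's pipeline this projection is what places depth-aware image features into the same BEV grid as the radar features.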
Yue Sun
The Global Institute of Future Technology, Shanghai Jiao Tong University, Shanghai, 200240, China
Yeqiang Qian
Shanghai Jiao Tong University (intelligent vehicle, computer vision)
Chunxiang Wang
The Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
Ming Yang
The Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China