R4Det: 4D Radar-Camera Fusion for High-Performance 3D Object Detection

📅 2026-03-12
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing 4D radar-camera fusion methods are limited by depth estimation that is not robust, temporal modeling that depends on ego-vehicle pose, and missed detections of small objects that return few or no radar points. To address these issues, this work proposes a Panoramic Depth Fusion module to improve depth quality, a Deformable Gated Temporal Fusion module that operates without ego-vehicle pose information, and an Instance-Guided Dynamic Refinement mechanism driven by 2D instance guidance. Together, these components reduce depth estimation errors, remove the dependency on pose data, and reduce missed detections of small targets. The proposed approach achieves state-of-the-art 3D object detection performance on both the TJ4DRadSet and VoD benchmarks.

📝 Abstract
The 4D radar-camera sensing configuration has gained increasing importance in autonomous driving. However, existing 3D object detection methods that fuse 4D radar and camera data face several challenges. First, their absolute depth estimation modules are neither robust nor accurate enough, which leads to inaccurate 3D localization. Second, their temporal fusion modules degrade dramatically, or fail outright, when the ego vehicle's pose is missing or inaccurate. Third, some small objects may return no radar points at all because the radar point cloud is sparse; in such cases, detection must rely solely on visual unimodal priors. To address these limitations, we propose R4Det, which improves depth estimation quality via the Panoramic Depth Fusion module, enabling mutual reinforcement between absolute and relative depth. For temporal fusion, we design a Deformable Gated Temporal Fusion module that does not rely on the ego vehicle's pose. In addition, we build an Instance-Guided Dynamic Refinement module that extracts semantic prototypes from 2D instance guidance. Experiments show that R4Det achieves state-of-the-art 3D object detection results on the TJ4DRadSet and VoD datasets.
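
The abstract does not spell out how absolute and relative depth reinforce each other. As a rough illustration of the general idea only, and not the paper's Panoramic Depth Fusion module, the sketch below uses a common generic technique: fitting a scale and shift by least squares so that a dense relative depth map agrees with a few sparse metric depths (for example, projected radar returns). The function name `align_relative_depth`, the affine relationship between relative and metric depth, and the toy data are all assumptions made for this example.

```python
import numpy as np


def align_relative_depth(rel_depth, sparse_abs_depth, valid_mask):
    """Fit scale s and shift t so that s * rel_depth + t matches sparse metric depths.

    Hypothetical helper for illustration; assumes metric depth is an affine
    function of the relative depth map.
    """
    x = rel_depth[valid_mask].astype(np.float64)
    y = sparse_abs_depth[valid_mask].astype(np.float64)
    if x.size < 2:
        raise ValueError("need at least two valid depth samples")
    # Solve min ||A @ [s, t] - y||^2 with A = [x, 1].
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * rel_depth + t, float(s), float(t)


# Toy data: a relative depth map and three sparse metric samples.
rng = np.random.default_rng(0)
rel = rng.uniform(0.1, 1.0, size=(4, 6))
true_depth = 30.0 * rel + 2.0           # assumed ground-truth relation
mask = np.zeros(rel.shape, dtype=bool)
mask[0, 1] = True
mask[2, 4] = True
mask[3, 0] = True
sparse = np.where(mask, true_depth, 0.0)

dense, s, t = align_relative_depth(rel, sparse, mask)
```

With noise-free samples the fit recovers the assumed scale and shift, so `dense` matches `true_depth` everywhere. Real radar depths are noisy and sparse, so a learned fusion module would be expected to do considerably more than this global alignment.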
Problem

Research questions and friction points this paper is trying to address.

3D object detection
4D radar-camera fusion
depth estimation
temporal fusion
sparse radar point clouds
Innovation

Methods, ideas, or system contributions that make the work stand out.

4D radar-camera fusion
Panoramic Depth Fusion
Deformable Gated Temporal Fusion
Instance-Guided Dynamic Refinement
3D object detection
Zhongyu Xia
Wangxuan Institute of Computer Technology, Peking University
Yousen Tang
Wangxuan Institute of Computer Technology, Peking University
Yongtao Wang
Wangxuan Institute of Computer Technology, Peking University
Zhifeng Wang
Liaoning University
economics
Weijun Qin
EBTech Co. Ltd