🤖 AI Summary
To address the low accuracy and poor robustness of 3D object detection in adverse weather, which stem from the inherent sparsity of 4D radar point clouds, this paper proposes a real-time detection method based on dense power tensor modeling. Departing from conventional sparse point cloud processing, the approach explicitly models the native four-dimensional (range, azimuth, elevation, Doppler) dense tensor of 4D radar data. It introduces a multi-teacher knowledge distillation framework that guides a student network to densify the features of its sparse input in latent space, paired with a lightweight real-time detection network. On the K-Radar dataset, the method achieves a 25% improvement in mean Average Precision (mAP) over RTNH while maintaining a real-time inference speed exceeding 30 FPS, indicating more reliable and safer detection under challenging weather such as rain, fog, and snow.
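To make the contrast between the dense tensor and a sparse point cloud concrete, here is a minimal NumPy sketch. It assumes the sparse input is obtained by keeping only the highest-power cells of a (range, azimuth, elevation, Doppler) tensor; the function name `sparsify_power_tensor`, the `keep_fraction` parameter, and the toy tensor shape are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sparsify_power_tensor(power, keep_fraction=0.1):
    """Keep only the strongest cells of a dense power tensor as a point list.

    power: array of shape (range, azimuth, elevation, doppler).
    Returns (indices, values): indices has shape (N, 4) with integer cell
    coordinates, values has shape (N,) with the corresponding power.
    keep_fraction is a hypothetical parameter chosen for illustration.
    """
    flat = power.ravel()
    n_keep = max(1, int(round(keep_fraction * flat.size)))
    # argpartition places the n_keep largest entries in the last n_keep slots
    top = np.argpartition(flat, flat.size - n_keep)[-n_keep:]
    indices = np.stack(np.unravel_index(top, power.shape), axis=1)
    return indices, flat[top]

# Toy tensor standing in for real radar data (shape is arbitrary)
rng = np.random.default_rng(0)
tensor = rng.random((8, 6, 4, 5))
points, values = sparsify_power_tensor(tensor, keep_fraction=0.1)
```

Everything outside the retained cells is discarded in the sparse view, which is the information a tensor-aware teacher could still see.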
📝 Abstract
Accurate 3D object detection is crucial for safe autonomous navigation, requiring reliable performance across diverse weather conditions. While LiDAR performance deteriorates in challenging weather, Radar systems maintain their reliability. Traditional Radars are limited by their lack of elevation data, but recent 4D Radars overcome this by measuring elevation alongside range, azimuth, and Doppler velocity, making them invaluable for autonomous vehicles. The primary challenge in utilizing 4D Radars is the sparsity of their point clouds. Previous works address this by developing architectures that better capture semantics and context in sparse point clouds, largely drawing from LiDAR-based approaches. However, these methods often overlook a unique advantage of 4D Radars: the dense Radar tensor, which encapsulates power measurements across three spatial dimensions and the Doppler dimension. Our paper leverages this tensor to tackle the sparsity issue. We introduce a novel knowledge distillation framework that enables a student model to densify its sparse input in the latent space by emulating an ensemble of teacher models. Our experiments demonstrate a 25% performance improvement over the state-of-the-art RTNH model on the K-Radar dataset. Notably, this improvement is achieved while still maintaining a real-time inference speed.
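The idea of a student emulating an ensemble of teachers in latent space can be sketched as a feature-matching loss. The abstract does not specify the loss, so the version below is an assumption: it averages the teachers' feature maps (optionally weighted) and penalizes the mean squared error between that target and the student's features. The function name `ensemble_distillation_loss` and the toy shapes are hypothetical.

```python
import numpy as np

def ensemble_distillation_loss(student_feat, teacher_feats, weights=None):
    """MSE between student features and a weighted mean of teacher features.

    student_feat: array of shape (C, H, W).
    teacher_feats: list of arrays, each of shape (C, H, W).
    weights: optional per-teacher weights; uniform if omitted (assumption).
    """
    teachers = np.stack(teacher_feats, axis=0)           # (T, C, H, W)
    if weights is None:
        weights = np.ones(len(teacher_feats))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                     # normalize to sum to 1
    target = np.tensordot(weights, teachers, axes=1)      # weighted mean over T
    return float(np.mean((student_feat - target) ** 2))

# Toy example: two teachers with constant features 1 and 3, student at 0
student = np.zeros((2, 3, 3))
teachers = [np.ones((2, 3, 3)), 3.0 * np.ones((2, 3, 3))]
loss = ensemble_distillation_loss(student, teachers)      # target is 2, so loss is 4.0
```

In a training loop, a term like this would typically be added to the detection loss so that the student, fed sparse input, is pulled toward the denser representations produced by the teachers.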