🤖 AI Summary
Millimeter-wave radar point clouds suffer from sparsity and low resolution, severely limiting 3D perception in complex environments. To address this, we propose RaLD, a latent diffusion model tailored for radar that leverages a frustum-wise LiDAR autoencoder to construct a robust latent space, adopts order-invariant point cloud representations, and conditions the diffusion process directly on raw radar spectrums. This framework eliminates reliance on dense ground-truth point clouds and enables end-to-end generation of dense, high-fidelity 3D point clouds from single-frame radar spectrums. Experiments demonstrate substantial gains in point cloud density and geometric accuracy, significantly enhancing the robustness of autonomous driving perception under adverse weather and low signal-to-noise ratios. RaLD establishes a new paradigm for radar-based 3D reconstruction.
📝 Abstract
Millimeter-wave radar offers a promising sensing modality for autonomous systems thanks to its robustness in adverse conditions and low cost. However, its utility is significantly limited by the sparsity and low resolution of radar point clouds, which pose challenges for tasks requiring dense and accurate 3D perception. While recent efforts have shown great potential by exploring generative approaches to this issue, they often rely on dense voxel representations that are inefficient and struggle to preserve structural detail. We make the key observation that latent diffusion models (LDMs), though successful in other modalities, have not been effectively leveraged for radar-based 3D generation due to a lack of compatible representations and conditioning strategies. To fill this gap, we introduce RaLD, a framework that integrates scene-level frustum-based LiDAR autoencoding, order-invariant latent representations, and direct radar spectrum conditioning, yielding a more compact and expressive generation process. Experiments show that RaLD produces dense and accurate 3D point clouds from raw radar spectrums, offering a promising solution for robust perception in challenging environments.
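The pipeline described above — encode the radar spectrum into a conditioning signal, then run a reverse diffusion process in a latent space learned from LiDAR — can be illustrated with a toy sketch. Everything here is a hypothetical stand-in (the shapes, `encode_spectrum`, and the linear `denoise_step` are not from the paper; the real denoiser is a learned network):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, not from the paper.
LATENT_DIM, SPECTRUM_DIM, STEPS = 32, 64, 10

def encode_spectrum(spectrum, W):
    """Toy conditioning encoder: project the raw radar spectrum
    into the latent space (stand-in for a learned encoder)."""
    return np.tanh(W @ spectrum)

def denoise_step(z, cond, t):
    """Placeholder for one reverse-diffusion step: nudge the latent
    toward the conditioning signal as t decreases. A real LDM would
    use a trained network to predict and remove noise."""
    alpha = t / STEPS
    return alpha * z + (1 - alpha) * cond

def generate(spectrum, W):
    cond = encode_spectrum(spectrum, W)
    z = rng.standard_normal(LATENT_DIM)   # start from Gaussian noise
    for t in range(STEPS, 0, -1):         # reverse diffusion loop
        z = denoise_step(z, cond, t)
    return z                              # latent decoded to points by the autoencoder

W = rng.standard_normal((LATENT_DIM, SPECTRUM_DIM)) / np.sqrt(SPECTRUM_DIM)
spectrum = rng.standard_normal(SPECTRUM_DIM)  # single-frame radar spectrum (toy)
latent = generate(spectrum, W)
print(latent.shape)
```

In the actual framework, the final latent would be passed through the frustum-wise LiDAR autoencoder's decoder to produce the dense 3D point cloud; the sketch only shows the conditioned generation loop.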