🤖 AI Summary
To address the challenges of high noise levels and weak discriminative features in radar-based target detection, this paper proposes a novel paradigm integrating semantic 3D urban models with self-supervised radar–image learning. First, we introduce RadarCity—the first large-scale dataset featuring synchronized radar–image pairs aligned with open semantic 3D urban models (e.g., OpenStreetMap + CityGML). Second, we design RADLER: a detection framework that employs contrastive self-supervised pretraining on radar–image pairs to learn robust multimodal representations, and incorporates geometrically structured depth priors—generated by projecting semantic 3D models—into the detection head to enable geometry-guided precise localization and classification. This work is the first to leverage open semantic 3D urban models as structured geometric priors for radar detection, establishing a synergistic framework unifying self-supervised pretraining and deep semantic integration. On RadarCity, RADLER achieves absolute improvements of +5.46% in mAP and +3.51% in mAR over state-of-the-art radar-only methods.
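The contrastive radar–image pretraining described above pairs each radar frame with its synchronized camera image and pulls matching embeddings together while pushing mismatched batch entries apart. A minimal NumPy sketch of a CLIP-style symmetric InfoNCE objective, a common choice for such cross-modal pretraining (the paper's exact loss, encoder outputs, and temperature are assumptions here):

```python
import numpy as np

def info_nce(radar_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired radar/image embeddings.

    Row i of each modality is a positive pair; every other in-batch
    combination serves as a negative. Temperature 0.07 is a conventional
    default, not a value from the paper.
    """
    # L2-normalize so the dot product is cosine similarity.
    r = radar_emb / np.linalg.norm(radar_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    logits = (r @ v.T) / temperature        # (B, B) similarity matrix
    idx = np.arange(logits.shape[0])        # positives lie on the diagonal

    def cross_entropy(l):
        # Numerically stable log-softmax per row.
        l = l - l.max(axis=1, keepdims=True)
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_p[idx, idx].mean()

    # Average the radar→image and image→radar directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Well-aligned pairs drive the loss toward zero, so a frozen copy of the pretrained radar encoder can then supply noise-robust features to the detection head.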
📝 Abstract
Semantic 3D city models are easily accessible worldwide, providing accurate, object-oriented, and semantically rich 3D priors. To date, their potential to mitigate the impact of noise on radar object detection remains under-explored. In this paper, we first introduce a unique dataset, RadarCity, comprising 54K synchronized radar-image pairs and semantic 3D city models. Moreover, we propose a novel neural network, RADLER, leveraging the effectiveness of contrastive self-supervised learning (SSL) and semantic 3D city models to enhance radar object detection of pedestrians, cyclists, and cars. Specifically, we first obtain robust radar features via an SSL network in a radar-image pretext task. We then use a simple yet effective feature fusion strategy to incorporate semantic-depth features from semantic 3D city models. With prior 3D information as guidance, RADLER obtains more fine-grained details to enhance radar object detection. We extensively evaluate RADLER on the collected RadarCity dataset and demonstrate average improvements of 5.46% in mean average precision (mAP) and 3.51% in mean average recall (mAR) over previous radar object detection methods. We believe this work will foster further research on semantic-guided and map-supported radar object detection. Our project page is publicly available at https://gpp-communication.github.io/RADLER .
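The "simple yet effective feature fusion strategy" is not specified in the abstract; a common realization is channel-wise concatenation of the radar feature map with the semantic-depth map rendered from the 3D city model, followed by a learned 1×1 projection. The sketch below illustrates that hypothetical variant in NumPy (all shapes and the projection matrix `w` are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

def fuse_features(radar_feat, depth_feat, w):
    """Hypothetical fusion: concatenate the radar features with
    semantic-depth features along the channel axis, then apply a
    learned per-pixel linear projection (a 1x1 convolution).

    radar_feat: (H, W, C_r) features from the pretrained radar encoder
    depth_feat: (H, W, C_d) semantic-depth features projected from the
                3D city model into the radar view
    w:          (C_r + C_d, C_out) projection weights
    """
    fused = np.concatenate([radar_feat, depth_feat], axis=-1)
    return fused @ w  # (H, W, C_out), fed to the detection head
```

Concatenation plus a 1×1 projection keeps the fusion lightweight while letting the detection head weight the geometric prior against the (noisier) radar evidence per channel.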