🤖 AI Summary
Existing collaborative perception datasets lack 4D radar, hindering robust vehicle-infrastructure cooperative perception under adverse weather. To address this, we introduce V2X-Radar, the first large-scale, real-world, multimodal cooperative perception dataset featuring 4D radar, with synchronized, high-precision onboard and roadside 4D radar, LiDAR, and multi-view camera data collected across diverse conditions (sunny and rainy weather; daytime, dusk, and nighttime) and typical challenging traffic scenarios. It comprises 20K 4D radar frames, 20K LiDAR sweeps, 40K camera images, and 350K annotated 3D bounding boxes spanning five object classes. The dataset is organized into three subsets supporting distinct research paradigms: V2X-Radar-C (cooperative perception), V2X-Radar-I (roadside perception), and V2X-Radar-V (single-vehicle perception). All data, annotations, benchmark code, and evaluation results are publicly released, establishing the first comprehensive 4D-radar-enabled multimodal V2X benchmark.
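The core geometric step behind fusing onboard and roadside detections is a coordinate transform between the two agents' sensor frames. Below is a minimal sketch of that transform, not the dataset's actual API: the extrinsic matrices, their naming, and the per-frame calibration format are assumptions for illustration.

```python
# A minimal sketch (hypothetical, not V2X-Radar's released API) of the
# coordinate transform underlying cooperative perception: a roadside
# detection, expressed in infrastructure-LiDAR coordinates, is mapped into
# the ego vehicle's LiDAR frame via world-frame extrinsics (assumed to be
# 4x4 homogeneous matrices supplied per frame by calibration files).
import numpy as np

def to_homogeneous(pts: np.ndarray) -> np.ndarray:
    """Append a 1 to each 3D point: (N, 3) -> (N, 4)."""
    return np.hstack([pts, np.ones((pts.shape[0], 1))])

def infra_to_vehicle(pts_infra: np.ndarray,
                     T_world_from_infra: np.ndarray,
                     T_world_from_vehicle: np.ndarray) -> np.ndarray:
    """Map points from the infrastructure frame into the vehicle frame."""
    T_vehicle_from_infra = np.linalg.inv(T_world_from_vehicle) @ T_world_from_infra
    return (T_vehicle_from_infra @ to_homogeneous(pts_infra).T).T[:, :3]

# Toy example: the roadside unit sits 10 m ahead of the vehicle along the
# world x-axis, both sensors axis-aligned, so an infrastructure-frame point
# simply shifts by +10 m in x once expressed in the vehicle frame.
T_wi = np.eye(4); T_wi[0, 3] = 10.0   # world <- infrastructure
T_wv = np.eye(4)                      # world <- vehicle
box_center_infra = np.array([[2.0, 0.5, 1.0]])
print(infra_to_vehicle(box_center_infra, T_wi, T_wv))  # [[12.  0.5  1. ]]
```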
📝 Abstract
Modern autonomous vehicle perception systems often struggle with occlusions and limited perception range. Previous studies have demonstrated the effectiveness of cooperative perception in extending the perception range and overcoming occlusions, thereby enhancing the safety of autonomous driving. In recent years, a series of cooperative perception datasets have emerged; however, these datasets primarily focus on cameras and LiDAR, neglecting 4D Radar, a sensor used in single-vehicle autonomous driving to provide robust perception in adverse weather conditions. In this paper, to bridge the gap created by the absence of 4D Radar in cooperative perception datasets, we present V2X-Radar, the first large-scale, real-world multi-modal dataset featuring 4D Radar. The V2X-Radar dataset is collected using a connected vehicle platform and an intelligent roadside unit equipped with 4D Radar, LiDAR, and multi-view cameras. The collected data encompasses sunny and rainy weather conditions, spanning daytime, dusk, and nighttime, as well as various typical challenging scenarios. The dataset consists of 20K LiDAR frames, 40K camera images, and 20K 4D Radar frames, with 350K annotated 3D boxes across five categories. To support various research domains, we have established V2X-Radar-C for cooperative perception, V2X-Radar-I for roadside perception, and V2X-Radar-V for single-vehicle perception. Furthermore, we provide comprehensive benchmarks across these three sub-datasets. We will release all datasets and the benchmark codebase at http://openmpd.com/column/V2X-Radar and https://github.com/yanglei18/V2X-Radar.
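To make the three-subset organization concrete, here is a small sketch of how one synchronized cooperative sample could be indexed. The directory layout, file extensions, field names, and the five class names are all hypothetical, the actual release at https://github.com/yanglei18/V2X-Radar may differ; the abstract states only that there are five categories.

```python
# A hypothetical sketch (layout and names assumed, not the official format)
# of indexing one V2X-Radar-C sample: each frame pairs a vehicle-side and
# an infrastructure-side capture, each carrying 4D radar, LiDAR, and
# multi-view camera files, plus shared 3D bounding-box annotations.
from dataclasses import dataclass
from pathlib import Path

# The paper reports five annotated categories; these names are assumptions.
CLASSES = ["car", "truck", "bus", "pedestrian", "cyclist"]

@dataclass
class AgentFrame:
    radar: Path          # 4D radar point cloud
    lidar: Path          # LiDAR sweep
    images: list[Path]   # multi-view camera images
    calib: Path          # sensor intrinsics/extrinsics

@dataclass
class CoopSample:
    frame_id: str
    vehicle: AgentFrame
    infrastructure: AgentFrame
    labels: Path         # jointly annotated 3D bounding boxes

def build_sample(root: Path, frame_id: str, num_cams: int = 2) -> CoopSample:
    """Compose the expected file paths for one synchronized frame."""
    def agent(side: str) -> AgentFrame:
        base = root / side
        return AgentFrame(
            radar=base / "radar" / f"{frame_id}.pcd",
            lidar=base / "lidar" / f"{frame_id}.pcd",
            images=[base / f"camera_{i}" / f"{frame_id}.jpg"
                    for i in range(num_cams)],
            calib=base / "calib" / f"{frame_id}.json",
        )
    return CoopSample(
        frame_id=frame_id,
        vehicle=agent("vehicle-side"),
        infrastructure=agent("infrastructure-side"),
        labels=root / "labels" / f"{frame_id}.json",
    )

if __name__ == "__main__":
    sample = build_sample(Path("V2X-Radar-C"), "000042")
    print(sample.vehicle.radar)  # V2X-Radar-C/vehicle-side/radar/000042.pcd
```

The same per-agent structure would cover V2X-Radar-I (infrastructure side only) and V2X-Radar-V (vehicle side only), which is why the sketch factors the agent layout into a single helper.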