🤖 AI Summary
Existing V2V cooperative perception datasets are largely confined to routine traffic scenarios and lack high-quality multimodal data under complex adverse weather and illumination conditions, hindering the robustness of autonomous driving perception in challenging environments. To address this, we introduce the first real-world V2V cooperative perception dataset specifically designed for complex adverse traffic scenarios. It features hardware-level temporal synchronization between the two collection vehicles, capturing LiDAR, multi-view camera, RTK-GNSS, and IMU data under ten distinct weather and illumination conditions. We propose a target-based temporal alignment method to achieve high-precision cross-modal spatiotemporal synchronization. Furthermore, we provide time-consistent 3D object annotations and static scene reconstruction, enabling 4D bird's-eye-view (BEV) modeling. The dataset comprises 100 sequences, 60K LiDAR sweeps, 1.26M images, and 750K high-precision localization records, making it the largest and highest-quality V2V cooperative perception benchmark of its kind to date.
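To give a rough sense of what cross-modal temporal alignment involves, the snippet below is a generic sketch (not the paper's target-based method): it matches each 10 Hz LiDAR sweep to its nearest 30 Hz camera frame and linearly interpolates a higher-rate GNSS/IMU pose track to the sweep timestamps. The function names, sensor rates used here, and toy timelines are illustrative assumptions.

```python
import numpy as np

def nearest_indices(query_ts, ref_ts):
    """For each query timestamp, return the index of the closest reference timestamp."""
    idx = np.searchsorted(ref_ts, query_ts)
    idx = np.clip(idx, 1, len(ref_ts) - 1)
    left, right = ref_ts[idx - 1], ref_ts[idx]
    return np.where(query_ts - left < right - query_ts, idx - 1, idx)

def interpolate_positions(query_ts, pose_ts, positions):
    """Linearly interpolate (N, 3) positions to the query timestamps."""
    return np.stack(
        [np.interp(query_ts, pose_ts, positions[:, k]) for k in range(3)], axis=1
    )

# Toy timelines over one second: 10 Hz LiDAR, 30 Hz camera, 100 Hz GNSS/IMU.
lidar_ts = np.arange(0.0, 1.0, 0.10)
cam_ts = np.arange(0.0, 1.0, 1.0 / 30.0)
pose_ts = np.arange(0.0, 1.0, 0.01)
positions = np.column_stack(
    [pose_ts * 10.0, np.zeros_like(pose_ts), np.zeros_like(pose_ts)]
)

cam_for_lidar = nearest_indices(lidar_ts, cam_ts)                     # camera frame per sweep
pose_at_lidar = interpolate_positions(lidar_ts, pose_ts, positions)  # interpolated pose per sweep
print(cam_for_lidar[:5], pose_at_lidar[:2])
```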
📝 Abstract
Vehicle-to-Vehicle (V2V) cooperative perception has great potential to enhance autonomous driving performance by overcoming perception limitations in complex adverse traffic scenarios (CATS). Meanwhile, data serves as the fundamental infrastructure for modern autonomous driving AI. However, due to stringent data collection requirements, existing datasets focus primarily on ordinary traffic scenarios, constraining the benefits of cooperative perception. To address this challenge, we introduce CATS-V2V, the first-of-its-kind real-world dataset for V2V cooperative perception under complex adverse traffic scenarios. The dataset was collected by two hardware time-synchronized vehicles, covering 10 weather and lighting conditions across 10 diverse locations. The 100-clip dataset includes 60K frames of 10 Hz LiDAR point clouds and 1.26M multi-view 30 Hz camera images, along with 750K anonymized yet high-precision RTK-fixed GNSS and IMU records. Correspondingly, we provide time-consistent 3D bounding box annotations for objects, as well as static scene annotations used to construct a 4D BEV representation. On this basis, we propose a target-based temporal alignment method that ensures all objects are precisely aligned across all sensor modalities. We hope that CATS-V2V, the largest-scale, most broadly supportive, and highest-quality dataset of its kind to date, will benefit the autonomous driving community in related tasks.
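To make the cooperative setup concrete, the following minimal sketch shows the basic fusion step underlying a shared BEV view: transforming each vehicle's LiDAR sweep into a common world frame using its synchronized, RTK-fixed pose, then merging the two clouds. The pose convention (planar yaw), function names, and toy data are assumptions for illustration, not the dataset's released toolkit.

```python
import numpy as np

def pose_to_matrix(x, y, z, yaw):
    """Build a 4x4 ego-to-world transform from position and heading (yaw, radians)."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    T[:3, 3] = [x, y, z]
    return T

def to_world(points_ego, T_ego_to_world):
    """Transform an (N, 3) point cloud from the ego frame to the world frame."""
    homo = np.hstack([points_ego, np.ones((len(points_ego), 1))])
    return (homo @ T_ego_to_world.T)[:, :3]

# Toy example: one temporally aligned sweep per vehicle, each with its own pose.
sweep_a = np.random.rand(1000, 3) * 50.0
sweep_b = np.random.rand(1000, 3) * 50.0
T_a = pose_to_matrix(0.0, 0.0, 0.0, 0.0)
T_b = pose_to_matrix(30.0, 5.0, 0.0, np.pi / 2)

# Combined cooperative point cloud in the shared frame, ready for BEV rasterization.
merged = np.vstack([to_world(sweep_a, T_a), to_world(sweep_b, T_b)])
print(merged.shape)
```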