AGC-Drive: A Large-Scale Dataset for Real-World Aerial-Ground Collaboration in Driving Scenarios

📅 2025-06-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address perception bottlenecks in autonomous driving, particularly severe occlusion and the lack of overhead-view supervision, this paper introduces AGC-Drive, the first large-scale, real-world, aerial-ground collaborative 3D perception dataset. It captures 14 complex road scenarios and dynamic interaction events, with synchronized multimodal data from two ground vehicles (each equipped with five cameras and one LiDAR) and one UAV (forward-facing camera and LiDAR), achieving millisecond-level spatiotemporal alignment and full-frame 3D annotations for 13 object classes. The paper proposes an aerial-ground collaborative perception paradigm and establishes two benchmark tasks: vehicle-to-vehicle (V2V) and vehicle-to-UAV (V2UAV) collaborative 3D perception. The dataset comprises 120K LiDAR sweeps, 440K images, and 400 complete scenes, accompanied by an open-source, cross-platform toolchain for calibration, visualization, and collaborative annotation. AGC-Drive and its tools are publicly released to advance occlusion-resilient, globally aware perception in autonomous driving.
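The millisecond-level spatiotemporal alignment mentioned above comes down to matching each agent's sensor timestamps against its collaborators'. Below is a minimal sketch of how such nearest-timestamp matching could be checked; the function name, the 50 ms tolerance, and the simulated 10 Hz streams are illustrative assumptions, not part of the released AGC-Drive toolkit.

```python
import numpy as np

def match_nearest(ref_ts, other_ts, tol_ms=50.0):
    """For each reference timestamp, find the closest timestamp in
    another agent's stream and flag whether it falls within tolerance.

    ref_ts, other_ts: 1-D arrays of timestamps in milliseconds.
    Returns indices into other_ts and a boolean validity mask.
    """
    ref_ts = np.asarray(ref_ts, dtype=np.float64)
    other_ts = np.asarray(other_ts, dtype=np.float64)
    # Index of the closest other-agent timestamp for every reference frame.
    idx = np.abs(other_ts[None, :] - ref_ts[:, None]).argmin(axis=1)
    valid = np.abs(other_ts[idx] - ref_ts) <= tol_ms
    return idx, valid

# Hypothetical 10 Hz streams from the ego vehicle and the UAV.
ego_ts = np.arange(0, 10_000, 100.0)                        # ms
uav_ts = ego_ts + np.random.uniform(-20, 20, ego_ts.size)   # clock jitter
idx, valid = match_nearest(ego_ts, uav_ts, tol_ms=50.0)
print(f"{valid.mean():.1%} of frames aligned within 50 ms")
```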

📝 Abstract
By sharing information across multiple agents, collaborative perception helps autonomous vehicles mitigate occlusions and improve overall perception accuracy. Most previous work, however, focuses on vehicle-to-vehicle and vehicle-to-infrastructure collaboration, paying limited attention to the aerial perspectives provided by UAVs, which uniquely offer dynamic, top-down views that alleviate occlusions and enable monitoring of large-scale interactive environments. A major reason for this is the lack of high-quality datasets for aerial-ground collaborative scenarios. To bridge this gap, we present AGC-Drive, the first large-scale real-world dataset for Aerial-Ground Cooperative 3D perception. The data collection platform consists of two vehicles, each equipped with five cameras and one LiDAR sensor, and one UAV carrying a forward-facing camera and a LiDAR sensor, enabling comprehensive multi-view and multi-agent perception. Comprising approximately 120K LiDAR frames and 440K images, the dataset covers 14 diverse real-world driving scenarios, including urban roundabouts, highway tunnels, and on/off ramps. Notably, 19.5% of the data captures dynamic interaction events, including vehicle cut-ins, cut-outs, and frequent lane changes. AGC-Drive contains 400 scenes, each with approximately 100 frames and fully annotated 3D bounding boxes covering 13 object categories. We provide benchmarks for two 3D perception tasks: vehicle-to-vehicle collaborative perception and vehicle-to-UAV collaborative perception. Additionally, we release an open-source toolkit, including spatiotemporal alignment verification tools, multi-agent visualization systems, and collaborative annotation utilities. The dataset and code are available at https://github.com/PercepX/AGC-Drive.
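The multi-agent setup described in the abstract implies that collaborative perception starts by expressing every sensor's points in a shared frame via calibration extrinsics. The sketch below illustrates this early-fusion step under assumed 4x4 sensor-to-world matrices; the actual AGC-Drive calibration format is not documented here, so all names and values are placeholders.

```python
import numpy as np

def to_world(points, extrinsic):
    """Transform an (N, 3) LiDAR point cloud into the shared world
    frame using a 4x4 homogeneous sensor-to-world extrinsic."""
    homo = np.hstack([points, np.ones((points.shape[0], 1))])
    return (homo @ extrinsic.T)[:, :3]

# Hypothetical calibration matrices; real ones would come from the
# dataset's calibration files.
T_ego_lidar = np.eye(4)
T_uav_lidar = np.eye(4)
T_uav_lidar[:3, 3] = [0.0, 0.0, 60.0]   # UAV hovering ~60 m above

ego_pts = np.random.rand(1000, 3) * 50   # placeholder vehicle sweep
uav_pts = np.random.rand(1000, 3) * 50   # placeholder UAV sweep

# Early fusion: concatenate both clouds in the shared frame.
fused = np.vstack([to_world(ego_pts, T_ego_lidar),
                   to_world(uav_pts, T_uav_lidar)])
print(fused.shape)  # (2000, 3)
```

Concatenating raw points like this corresponds to early fusion; collaborative-perception benchmarks also commonly evaluate intermediate (feature-level) and late (detection-level) fusion.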
Problem

Research questions and friction points this paper is trying to address.

Lack of high-quality datasets for aerial-ground collaborative perception scenarios
Need for dynamic top-down views to alleviate occlusions in autonomous driving
Limited research on UAV perspectives in multi-agent collaborative perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale aerial-ground collaborative 3D perception dataset
Multi-agent platform with vehicles and UAV sensors
Open-source toolkit for spatiotemporal alignment verification, multi-agent visualization, and collaborative annotation (see the sketch after this list)
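As a rough illustration of the multi-agent visualization idea, the sketch below renders two agents' point clouds, already aligned to a shared frame, in one Open3D window. The function name, placeholder clouds, and color scheme are assumptions; the released toolkit's actual interface may differ.

```python
import numpy as np
import open3d as o3d

def show_agents(clouds, colors):
    """Render several agents' point clouds (already in a shared frame)
    in a single window, one color per agent."""
    geoms = []
    for pts, rgb in zip(clouds, colors):
        pc = o3d.geometry.PointCloud()
        pc.points = o3d.utility.Vector3dVector(pts)
        pc.paint_uniform_color(rgb)
        geoms.append(pc)
    o3d.visualization.draw_geometries(geoms)

# Placeholder clouds standing in for one vehicle sweep and one UAV sweep.
ego = np.random.rand(2000, 3) * 40
uav = np.random.rand(2000, 3) * 40 + [0.0, 0.0, 5.0]
show_agents([ego, uav], [(1, 0, 0), (0, 0, 1)])  # ego red, UAV blue
```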