FlyAwareV2: A Multimodal Cross-Domain UAV Dataset for Urban Scene Understanding

📅 2025-10-15
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Urban drone vision algorithms are hindered by the scarcity of real-world annotated data and high labeling costs. Method: This paper introduces the first large-scale, multimodal drone dataset for urban scene understanding, integrating real and synthetic RGB images, depth maps, and semantic labels, spanning diverse weather conditions and day/night scenarios. It proposes a novel method to generate high-fidelity monocular depth maps for real drone imagery and establishes the first cross-domain adaptation benchmark to systematically evaluate synthetic-to-real generalization. Contribution/Results: The dataset and benchmark enable rigorous evaluation of model robustness and generalization, demonstrating significant improvements in both RGB-only and multimodal semantic segmentation tasks. This work advances synthetic-data-driven drone perception research; the dataset and benchmark are publicly released.

๐Ÿ“ Abstract
The development of computer vision algorithms for Unmanned Aerial Vehicle (UAV) applications in urban environments heavily relies on the availability of large-scale datasets with accurate annotations. However, collecting and annotating real-world UAV data is extremely challenging and costly. To address this limitation, we present FlyAwareV2, a novel multimodal dataset encompassing both real and synthetic UAV imagery tailored for urban scene understanding tasks. Building upon the recently introduced SynDrone and FlyAware datasets, FlyAwareV2 introduces several key new contributions: 1) Multimodal data (RGB, depth, semantic labels) across diverse environmental conditions, including varying weather and times of day; 2) Depth maps for real samples computed via state-of-the-art monocular depth estimation; 3) Benchmarks for RGB and multimodal semantic segmentation on standard architectures; 4) Studies on synthetic-to-real domain adaptation to assess the generalization capabilities of models trained on the synthetic data. With its rich set of annotations and environmental diversity, FlyAwareV2 provides a valuable resource for research on UAV-based 3D urban scene understanding.
Problem

Research questions and friction points this paper is trying to address.

Addressing UAV data scarcity through a multimodal real-and-synthetic dataset
Providing environmental diversity for urban scenes via weather and time-of-day variations
Enabling domain adaptation studies between synthetic and real UAV data
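Synthetic-to-real studies of this kind are typically scored by training a segmentation model on the synthetic split and measuring mean intersection-over-union (mIoU) on real imagery. The paper does not specify its metric implementation; the helper below is a minimal, illustrative sketch of how per-class IoU is commonly aggregated (the function name and skip-absent-class convention are assumptions, not the authors' code):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between two integer label maps of equal shape.

    Classes absent from both prediction and ground truth are skipped so they
    do not artificially inflate or deflate the average.
    """
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent in both maps; skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Toy example: a 2x2 label map with two classes.
gt = np.array([[0, 0], [0, 1]])
pr = np.array([[0, 0], [0, 0]])  # model misses the class-1 pixel
print(mean_iou(pr, gt, num_classes=2))  # class 0: 3/4, class 1: 0/1 -> 0.375
```

A perfect prediction yields 1.0; the synthetic-to-real gap is then the drop in mIoU relative to a model trained directly on real data.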
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal dataset with real and synthetic UAV imagery
Depth maps for real imagery generated via monocular depth estimation
Benchmarks for RGB and multimodal semantic segmentation
Synthetic-to-real domain adaptation benchmark
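For the multimodal segmentation benchmarks, a common baseline is early fusion: the depth map is normalized and stacked onto the RGB channels so a standard network can consume a 4-channel input. The paper does not describe its fusion strategy, so the snippet below is only a sketch of this generic approach (the function name and min-max normalization are assumptions):

```python
import numpy as np

def fuse_rgbd(rgb, depth):
    """Early-fusion baseline: stack an RGB image (H, W, 3) with a depth map (H, W)
    into a single (H, W, 4) array suitable as network input."""
    d = depth.astype(np.float64)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)  # min-max normalize depth to [0, 1]
    return np.concatenate([rgb, d[..., None]], axis=-1)

# Toy example: a 2x2 image with a matching depth map.
rgb = np.zeros((2, 2, 3))
depth = np.array([[1.0, 2.0], [3.0, 4.0]])
fused = fuse_rgbd(rgb, depth)
print(fused.shape)  # (2, 2, 4)
```

Early fusion keeps the architecture unchanged apart from the first convolution's input channels; later-fusion designs with separate RGB and depth encoders are the usual alternative.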