🤖 AI Summary
This work addresses two key challenges in aerial-ground cross-view novel view synthesis: (1) modeling large-scale altitude variations and dynamic objects, and (2) insufficient coverage of real-world scenarios in existing datasets. To this end, we introduce ACC-NVS1, a new dataset comprising 148,000 synchronized aerial-ground images captured across six real-world urban scenes in Austin and Pittsburgh during 2023–2024. ACC-NVS1 is the first to systematically integrate multi-source imagery from drones, vehicles, and static ground platforms. It features high-precision spatiotemporal calibration, pixel-level dynamic object masks, and multi-view consistency validation, explicitly encoding multi-scale height disparities and transient motion. The dataset provides accurate camera poses, dense scene geometry, and fine-grained dynamic annotations. By bridging the gap in cross-platform view synthesis data, ACC-NVS1 enables robust learning of coupled 3D geometry-motion priors, significantly improving model generalization for cross-view synthesis under complex altitude and motion conditions.
📝 Abstract
This paper introduces ACC-NVS1, a specialized dataset for research on Novel View Synthesis from airborne and ground imagery. Data for ACC-NVS1 was collected in Austin, TX and Pittsburgh, PA in 2023 and 2024. The collection encompasses six diverse real-world scenes captured from both airborne and ground cameras, for a total of 148,000 images. ACC-NVS1 addresses challenges such as varying altitudes and transient objects. The dataset is intended to supplement existing datasets and provide additional resources for comprehensive research, rather than to serve as a benchmark.