Bird's-Eye View to Street-View: A Survey

📅 2024-05-14
🏛️ arXiv.org
📈 Citations: 2
✨ Influential: 0
🤖 AI Summary
Satellite-to-street-view image synthesis is difficult because the two views differ sharply in appearance and geometry. This survey reviews 20 recent papers on the task and identifies three gaps: (1) limited adoption of modern architectures such as Transformers and diffusion models; (2) a shortage of large-scale, multi-city, multi-season, semantically annotated public datasets; and (3) a lack of scene-aware, task-specific evaluation metrics. The authors conclude that existing approaches rely on outdated network designs, which limits the detail fidelity and scene diversity of the generated street-view images.

๐Ÿ“ Abstract
In recent years, street view imagery has grown to become one of the most important sources of geospatial data collection and urban analytics, which facilitates generating meaningful insights and assisting in decision-making. Synthesizing a street-view image from its corresponding satellite image is a challenging task due to the significant differences in appearance and viewpoint between the two domains. In this study, we screened 20 recent research papers to provide a thorough review of the state-of-the-art of how street-view images are synthesized from their corresponding satellite counterparts. The main findings are: (i) novel deep learning techniques are required for synthesizing more realistic and accurate street-view images; (ii) more datasets need to be collected for public usage; and (iii) more specific evaluation metrics need to be investigated for evaluating the generated images appropriately. We conclude that, due to applying outdated deep learning techniques, the recent literature failed to generate detailed and diverse street-view images.
Problem

Research questions and friction points this paper is trying to address.

Synthesizing street-view images from satellite imagery
Addressing appearance and viewpoint differences between domains
Evaluating realistic and accurate street-view image generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning techniques synthesize street-view images
Satellite imagery converted to street-view perspective
Generative models create realistic urban imagery
Khawlah Bajbaa
Department of Information and Computer Science, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, Saudi Arabia
Muhammad Usman
Department of Information and Computer Science, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, Saudi Arabia; SDAIA-KFUPM Joint Research Center for Artificial Intelligence, KFUPM, Dhahran, Saudi Arabia
Saeed Anwar
University of Western Australia; Australian National University
Computer Vision, 3D Vision, Machine Learning, Generative AI
Ibrahim Radwan
University of Canberra, Australian Capital Territory (ACT), Canberra, Australia
Abdul Bais
University of Regina
Image Processing, Computer Vision, Deep Learning and Artificial Intelligence