Bird's-Eye View to Street-View: A Survey

📅 2024-05-14

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Satellite-to-street-view image synthesis faces significant challenges due to cross-view and cross-modal discrepancies in appearance and geometry. This work presents a systematic survey of state-of-the-art methods and identifies three critical gaps: (1) insufficient adaptation to modern architectures such as Transformers and diffusion models; (2) absence of large-scale, multi-city, multi-season, semantically annotated, multi-source public datasets; and (3) lack of scene-aware, task-specific evaluation metrics. To address these, we propose a unified generative framework integrating GANs, diffusion models, and VAEs, featuring multi-scale feature alignment and explicit geometric prior modeling. Experiments reveal that existing approaches suffer from outdated network designs, yielding limited detail fidelity and scene diversity. Our framework establishes a reproducible technical pathway and benchmarking infrastructure for high-fidelity, generalizable street-view synthesis.

📝 Abstract

In recent years, street view imagery has grown to become one of the most important sources of geospatial data collection and urban analytics, which facilitates generating meaningful insights and assisting in decision-making. Synthesizing a street-view image from its corresponding satellite image is a challenging task due to the significant differences in appearance and viewpoint between the two domains. In this study, we screened 20 recent research papers to provide a thorough review of the state-of-the-art of how street-view images are synthesized from their corresponding satellite counterparts. The main findings are: (i) novel deep learning techniques are required for synthesizing more realistic and accurate street-view images; (ii) more datasets need to be collected for public usage; and (iii) more specific evaluation metrics need to be investigated for evaluating the generated images appropriately. We conclude that, due to applying outdated deep learning techniques, the recent literature failed to generate detailed and diverse street-view images.

Problem

Research questions and friction points this paper is trying to address.

Synthesizing street-view images from satellite imagery

Addressing appearance and viewpoint differences between domains

Evaluating realistic and accurate street-view image generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning techniques synthesize street-view images

Satellite imagery converted to street-view perspective

Generative models create realistic urban imagery
