From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images

📅 2025-12-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses large-angle view extrapolation (up to ~90°) in city-scale 3D reconstruction from extremely off-nadir satellite imagery, where minimal parallax and severe perspective compression cause state-of-the-art engines such as NeRF and 3D Gaussian Splatting (3DGS) to fail. It proposes a method that integrates geometric priors with generative modeling. First, it introduces a 2.5D height-map representation, implemented as a Z-monotonic signed distance function (SDF), that explicitly encodes building topology and stabilizes geometry optimization. Second, it paints mesh appearance from the satellite inputs via differentiable rendering. Third, it trains a generative texture restoration network to recover high-frequency, plausible facade details from severely degraded observations. Using only a few sparse orbital images, the method achieves high-fidelity 3D modeling over a 4 km² urban area and sets a new state of the art in ground-level novel-view synthesis quality, improving feasibility for urban planning and digital-twin applications.
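
The key geometric idea is that an implicit function of the form φ(x, y, z) = z − h(x, y) is strictly monotonic in z, so every vertical column crosses zero exactly once, yielding a 2.5D surface with flat roofs and vertically extruded facades. Below is a minimal sketch of that construction in PyTorch; the `HeightField` module, its layer sizes, and the function names are illustrative assumptions, not the paper's implementation.

```python
# Sketch of a Z-monotonic height-field implicit function:
# phi(x, y, z) = z - h(x, y). Since h does not depend on z, d(phi)/dz = 1,
# so each (x, y) column has exactly one zero crossing -- a 2.5D surface.
import torch
import torch.nn as nn

class HeightField(nn.Module):
    """Hypothetical MLP mapping ground-plane (x, y) to building height h."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xy: torch.Tensor) -> torch.Tensor:
        # xy: (N, 2) ground coordinates -> (N,) height
        return self.mlp(xy).squeeze(-1)

def z_monotonic_sdf(field: HeightField, xyz: torch.Tensor) -> torch.Tensor:
    # xyz: (N, 3). Positive above the surface, negative below it.
    return xyz[..., 2] - field(xyz[..., :2])

phi = z_monotonic_sdf(HeightField(), torch.rand(5, 3))  # (5,) signed values
```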

📝 Abstract
City-scale 3D reconstruction from satellite imagery presents the challenge of extreme viewpoint extrapolation, where our goal is to synthesize ground-level novel views from sparse orbital images with minimal parallax. This requires inferring nearly $90^\circ$ viewpoint gaps from image sources with severely foreshortened facades and flawed textures, causing state-of-the-art reconstruction engines such as NeRF and 3DGS to fail. To address this problem, we propose two design choices tailored for city structures and satellite inputs. First, we model city geometry as a 2.5D height map, implemented as a Z-monotonic signed distance field (SDF) that matches urban building layouts from top-down viewpoints. This stabilizes geometry optimization under sparse, off-nadir satellite views and yields a watertight mesh with crisp roofs and clean, vertically extruded facades. Second, we paint the mesh appearance from satellite images via differentiable rendering techniques. While the satellite inputs may contain long-range, blurry captures, we further train a generative texture restoration network to enhance the appearance, recovering high-frequency, plausible texture details from degraded inputs. Our method's scalability and robustness are demonstrated through extensive experiments on large-scale urban reconstruction. For example, in our teaser figure, we reconstruct a $4\,\mathrm{km}^2$ real-world region from only a few satellite images, achieving state-of-the-art performance in synthesizing photorealistic ground views. The resulting models are not only visually compelling but also serve as high-fidelity, application-ready assets for downstream tasks like urban planning and simulation.
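
"Painting" the mesh from satellite views amounts to optimizing a texture atlas through a differentiable image-formation step. The sketch below shows the core loop under simplifying assumptions: the per-pixel texture coordinates `uv` (which in practice would come from rasterizing the reconstructed mesh) and the observed image `satellite_rgb` are random placeholders, and the photometric L1 loss is our illustrative choice, not necessarily the paper's.

```python
# Hedged sketch: optimize a learnable texture atlas so that a differentiable
# texture lookup reproduces the observed satellite pixels.
import torch
import torch.nn.functional as F

H, W = 128, 128
texture = torch.zeros(1, 3, 256, 256, requires_grad=True)  # learnable atlas
optimizer = torch.optim.Adam([texture], lr=5e-2)

# Placeholder stand-ins for one satellite view; real uv comes from a rasterizer.
uv = torch.rand(1, H, W, 2) * 2 - 1      # per-pixel texture coords in [-1, 1]
satellite_rgb = torch.rand(1, 3, H, W)   # observed colors for those pixels

for step in range(200):
    pred = F.grid_sample(texture, uv, align_corners=True)  # differentiable lookup
    loss = F.l1_loss(pred, satellite_rgb)                   # photometric loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```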
Problem

Research questions and friction points this paper is trying to address.

Synthesize ground-level views from sparse satellite images
Address extreme viewpoint gaps in city 3D reconstruction
Enhance degraded textures for photorealistic urban modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

2.5D height map with Z-monotonic SDF for stable geometry
Differentiable rendering to paint mesh appearance from satellite images
Generative texture restoration network to enhance degraded inputs (see the sketch after this list)
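
As a rough illustration of the restoration idea, a small residual encoder-decoder can map a blurry texture crop to a sharpened one. The architecture below is an assumption for exposition; the paper's network and training objective may differ substantially.

```python
# Hedged sketch of a texture restoration network: predict a residual over the
# degraded input so the identity mapping is easy to learn.
import torch
import torch.nn as nn

class TextureRestorer(nn.Module):
    def __init__(self, ch: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, degraded: torch.Tensor) -> torch.Tensor:
        # Residual prediction: output = input + learned correction
        return degraded + self.net(degraded)

restorer = TextureRestorer()
enhanced = restorer(torch.rand(1, 3, 64, 64))  # (1, 3, 64, 64) enhanced crop
```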