AI Summary
Existing satellite-image-based 3D building generation methods struggle to accurately reconstruct structural geometry from a single overhead view, while mainstream detail refinement approaches heavily rely on high-fidelity voxel inputs and cannot produce high-quality models from simple geometric priors (e.g., cuboids). To address this, we propose an end-to-end generative framework that jointly leverages satellite imagery and coarse geometric priors (e.g., bounding boxes), enabling flexible and lightweight structural control via noise-interpolation-driven geometric modeling transformations. We introduce Skylines-50K, the first large-scale, stylized 3D building dataset, and design a deep generative network that achieves coarse-to-fine geometric refinement without additional computational overhead. Experiments demonstrate strong generalization across diverse architectural forms, yielding 3D building models with accurate topology, rich surface details, and high fidelity.
Abstract
We present SatSkylines, a 3D building generation approach that takes satellite imagery and coarse geometric priors as input. Without proper geometric guidance, existing image-based 3D generation methods struggle to recover accurate building structures from the top-down views of satellite images alone. On the other hand, 3D detailization methods tend to rely heavily on highly detailed voxel inputs and fail to produce satisfactory results from simple priors such as cuboids. To address these issues, our key idea is to model the transformation from interpolated noisy coarse priors to detailed geometries, enabling flexible geometric control without additional computational cost. We have further developed Skylines-50K, a large-scale dataset of over 50,000 unique and stylized 3D building assets, to support the generation of detailed building models. Extensive evaluations indicate the effectiveness of our model and its strong generalization ability.
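The key idea of starting from an interpolated noisy coarse prior can be illustrated with a minimal sketch. The abstract does not specify the interpolation scheme, the geometry representation, or the network, so everything below is an assumption made for illustration: a dense occupancy grid stands in for the coarse prior, the blend is a simple linear mix with Gaussian noise, and the names `cuboid_prior`, `interpolate_with_noise`, and `alpha` are hypothetical. The generative model that would map this noisy input to detailed geometry is not shown.

```python
import numpy as np


def cuboid_prior(grid=32, lo=(8, 8, 0), hi=(24, 24, 20)):
    """Occupancy grid holding one axis-aligned box (a stand-in coarse prior)."""
    vox = np.zeros((grid, grid, grid), dtype=np.float32)
    vox[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] = 1.0
    return vox


def interpolate_with_noise(prior, alpha, rng):
    """Linearly blend the prior with Gaussian noise (assumed scheme).

    alpha = 0 returns the prior unchanged; alpha = 1 returns pure noise.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    noise = rng.standard_normal(prior.shape).astype(np.float32)
    return (1.0 - alpha) * prior + alpha * noise


rng = np.random.default_rng(0)
prior = cuboid_prior()
# A mid-range alpha keeps the rough footprint and height of the box while
# leaving room for a generative model to add detail.
noisy_start = interpolate_with_noise(prior, alpha=0.5, rng=rng)
```

In this sketch, a smaller `alpha` would tie the result more tightly to the cuboid and a larger one would give the model more freedom. Because the blend only changes the starting point, it adds no extra network passes, which is one way to read the "no additional computational cost" claim.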