🤖 AI Summary
Existing methods for automatic building footprint reconstruction from a single satellite image suffer from low accuracy and geometric irregularities, and rely heavily on manual post-processing. To address these limitations, this paper proposes an end-to-end framework that combines attraction field map guidance, multi-scale feature aggregation, and graph convolutional networks (GCNs). Specifically, we embed an attraction field map, which encodes structural priors, into a Transformer backbone to explicitly model building topology; employ GCNs to refine boundary connectivity and enforce topological consistency; and fuse multi-resolution features to improve robustness in complex scenes. Evaluated on standard benchmarks, our method achieves a 6% improvement in average precision (AP) and a 10% gain in average recall (AR) over state-of-the-art approaches. The reconstructed footprints exhibit greater geometric regularity and spatial coherence, which supports direct use in large-scale geospatial analytics such as urban planning and post-disaster assessment.
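To make the GCN refinement idea concrete, here is a minimal sketch of one graph-convolution step over the vertices of a closed polygon. It assumes the widely used symmetric-normalized propagation rule ReLU(D^-1/2 A D^-1/2 X W) with self-loops, a ring graph linking each vertex to its two neighbors, and raw (x, y) coordinates as features. All names (`ring_adjacency`, `gcn_layer`) and the toy data are hypothetical; this is not the paper's actual layer or architecture.

```python
from typing import List

Matrix = List[List[float]]


def ring_adjacency(n: int) -> Matrix:
    """Adjacency with self-loops for a closed polygon of n >= 3 vertices."""
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0            # self-loop
        adj[i][(i - 1) % n] = 1.0  # previous vertex on the boundary
        adj[i][(i + 1) % n] = 1.0  # next vertex on the boundary
    return adj


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain dense matrix product, kept dependency-free for illustration."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One propagation step: ReLU(D^-1/2 A D^-1/2 X W)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    norm = [
        [adj[i][j] / ((deg[i] ** 0.5) * (deg[j] ** 0.5)) for j in range(n)]
        for i in range(n)
    ]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy example (assumed data): a square footprint with (x, y) vertex features.
vertices = [[0.0, 0.0], [4.0, 0.0], [4.0, 4.0], [0.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned weight matrix
refined = gcn_layer(ring_adjacency(len(vertices)), vertices, identity)
```

With an identity weight matrix, each vertex simply becomes the average of itself and its two neighbors, which shows how information propagates along the boundary. In a trained network the weight matrix is learned and the features are typically image descriptors sampled at each vertex rather than raw coordinates.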
📝 Abstract
In recent years, the number of remote sensing satellites orbiting the Earth has grown significantly, streaming vast amounts of high-resolution visual data to support diverse applications across civil, public, and military domains. Among these applications, the generation and updating of spatial maps of the built environment have become critical due to the extensive coverage and detailed imagery provided by satellites. However, reconstructing spatial maps from satellite imagery is a complex computer vision task, requiring the creation of high-level object representations, such as primitives, to accurately capture the built environment. While the past decade has witnessed remarkable advancements in object detection and representation using visual data, primitive-based object representation remains a persistent challenge in computer vision. Consequently, high-quality spatial maps often rely on labor-intensive manual processes. This paper introduces a novel deep learning methodology leveraging Graph Convolutional Networks (GCNs) to address these challenges in building footprint reconstruction. The proposed approach enhances performance by incorporating geometric regularity into building boundaries, integrating multi-scale and multi-resolution features, and embedding Attraction Field Maps into the network. These innovations provide a scalable and precise solution for automated building footprint extraction from a single satellite image, paving the way for impactful applications in urban planning, disaster management, and large-scale spatial analysis. Our model, Decoupled-PolyGCN, outperforms existing methods by 6% in AP and 10% in AR, demonstrating its ability to deliver accurate and regularized building footprints across diverse and challenging scenarios.
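As an illustration of what an attraction field map can encode, the sketch below assigns to every pixel the 2D offset to the nearest point on a polygon boundary. It assumes this common definition of an attraction field (per-pixel vector to the closest boundary segment), which the abstract does not spell out. The function names and the toy footprint are hypothetical, and the brute-force loop is meant for clarity rather than speed.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def closest_point_on_segment(p: Point, seg: Segment) -> Point:
    """Return the point on segment `seg` that is closest to `p`."""
    (ax, ay), (bx, by) = seg
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: both endpoints coincide
        return (ax, ay)
    # Project p onto the supporting line, then clamp to the segment.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def polygon_to_segments(vertices: List[Point]) -> List[Segment]:
    """Turn a closed polygon into its list of boundary segments."""
    n = len(vertices)
    return [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]


def attraction_field_map(
    height: int, width: int, segments: List[Segment]
) -> List[List[Point]]:
    """field[y][x] holds the (dx, dy) offset from pixel (x, y) to the nearest boundary point."""
    field: List[List[Point]] = []
    for y in range(height):
        row: List[Point] = []
        for x in range(width):
            best: Point = (0.0, 0.0)
            best_dist = float("inf")
            for seg in segments:
                cx, cy = closest_point_on_segment((float(x), float(y)), seg)
                dist = (cx - x) ** 2 + (cy - y) ** 2
                if dist < best_dist:
                    best_dist = dist
                    best = (cx - x, cy - y)
            row.append(best)
        field.append(row)
    return field


# Toy example (assumed data): a square footprint on a 6x6 pixel grid.
square = [(1.0, 1.0), (4.0, 1.0), (4.0, 4.0), (1.0, 4.0)]
afm = attraction_field_map(6, 6, polygon_to_segments(square))
```

Pixels on the boundary get a zero vector, and every other pixel points toward the closest edge, so the map gives a dense regression target that carries boundary geometry across the whole image instead of only at edge pixels.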