Extracting polygonal footprints in off-nadir images with Segment Anything Model

📅 2024-08-16

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the low-quality results of multi-stage building footprint extraction from off-nadir aerial imagery, this paper proposes OBMv2, an end-to-end, promptable model that predicts polygonal footprints directly instead of relying on the conventional pipeline of roof segmentation followed by offset correction. Key contributions include: (1) a Self Offset Attention (SOFA) mechanism that improves performance across diverse building types, from bungalows to skyscrapers; (2) a Multi-level Information System (MISS) that combines roof masks, building masks, and offsets for accurate footprint prediction; and (3) a promptable framework built on the Segment Anything Model (SAM). The model outputs footprint polygons without post-processing. It is evaluated on the BONAI and OmniCity-view3 datasets, and its generalization is demonstrated on the Huizhou test set.

📝 Abstract

Building Footprint Extraction (BFE) from off-nadir aerial images often involves roof segmentation and offset prediction to adjust roof boundaries to the building footprint. However, this multi-stage approach typically produces low-quality results, limiting its applicability in real-world data production. To address this issue, we present OBMv2, an end-to-end and promptable model for polygonal footprint prediction. Unlike its predecessor OBM, OBMv2 introduces a novel Self Offset Attention (SOFA) mechanism that improves performance across diverse building types, from bungalows to skyscrapers, enabling end-to-end footprint prediction without post-processing. Additionally, we propose a Multi-level Information System (MISS) to effectively leverage roof masks, building masks, and offsets for accurate footprint prediction. We evaluate OBMv2 on the BONAI and OmniCity-view3 datasets and demonstrate its generalization on the Huizhou test set. The code will be available at https://github.com/likaiucas/OBMv2.
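The roof-plus-offset relation that the abstract describes as the conventional approach can be illustrated with a minimal sketch. This is not OBMv2's implementation. It assumes a roof polygon given as pixel coordinates and a single per-building offset vector pointing from roof to footprint; the function names `roof_to_footprint` and `polygon_area` and the sample values are hypothetical.

```python
def roof_to_footprint(roof_polygon, offset):
    """Shift every roof vertex by one (dx, dy) roof-to-footprint offset.

    Assumption: the offset is a single 2D translation in pixels that is
    shared by all vertices of the building.
    """
    dx, dy = offset
    return [(x + dx, y + dy) for x, y in roof_polygon]


def polygon_area(polygon):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical roof outline and offset, in pixel coordinates.
roof = [(10, 10), (30, 10), (30, 25), (10, 25)]
footprint = roof_to_footprint(roof, offset=(-4, 6))
```

Because the footprint here is a pure translation of the roof, its shape and area are unchanged, so any error in the roof mask or in the predicted offset carries straight through to the footprint.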

Problem

Research questions and friction points this paper is trying to address.

Extracting precise polygonal building footprints from off-nadir images

Overcoming geometric complexities in off-nadir viewing angles

Improving boundary accuracy without external post-processing steps

Innovation

Methods, ideas, or system contributions that make the work stand out.

Direct polygonal output without post-processing

High-Quality Mask Prompter for precise roofs

Self Offset Attention for accuracy improvement
