GATA2Floor: Graph attention for floor counting in street-view facades

📅 2026-05-12

🤖 AI Summary
This work addresses automatic floor counting from street-view facade images. It proposes GATA2Floor, a model that represents each facade as a graph whose nodes are window and door detections and whose edges carry a vertical geometric prior. Using multi-head GATv2 layers and an interpretable cross-attention mechanism, the model jointly predicts the number of floors and softly assigns facade elements to latent floor slots. It also combines self-supervised visual features with vision-language scoring in a label-free proposal mechanism, so the graph-based reasoning can be applied without explicit floor-level annotations. According to the summary, experiments show robust counting on irregular facades, supporting graph-attention-based relational reasoning for facade understanding with reduced reliance on labeled data.
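To make the graph representation concrete, the following is a minimal sketch of one way a facade graph could be built from detection boxes. It is not the authors' implementation: the function name `build_facade_graph`, the k-nearest-neighbour connectivity, and the exponential weight `exp(-|dy| / sigma)` are all assumptions chosen for illustration. The only idea taken from the source is that edges carry a vertical prior, read here as "elements at similar heights are more likely to share a floor".

```python
import math

def build_facade_graph(boxes, sigma=0.1, k=2):
    """Build a directed graph over facade detections (hypothetical sketch).

    boxes: list of (x_min, y_min, x_max, y_max), normalized to [0, 1].
    Returns a dict mapping (i, j) -> edge weight, where the weight decays
    with the vertical offset between box centers (assumed vertical prior).
    """
    centers = [((x0 + x1) / 2, (y0 + y1) / 2) for x0, y0, x1, y1 in boxes]
    edges = {}
    for i, (xi, yi) in enumerate(centers):
        # Rank all other nodes by Euclidean distance between centers.
        others = sorted(
            (j for j in range(len(centers)) if j != i),
            key=lambda j: math.hypot(centers[j][0] - xi, centers[j][1] - yi),
        )
        # Connect node i to its k nearest neighbours.
        for j in others[:k]:
            dy = abs(centers[j][1] - yi)
            # Assumed prior: same-height pairs get weight near 1.
            edges[(i, j)] = math.exp(-dy / sigma)
    return edges

# Toy facade: two windows on an upper row, two on a lower row.
boxes = [
    (0.1, 0.15, 0.3, 0.25),  # node 0, upper left
    (0.6, 0.15, 0.8, 0.25),  # node 1, upper right
    (0.1, 0.55, 0.3, 0.65),  # node 2, lower left
    (0.6, 0.55, 0.8, 0.65),  # node 3, lower right
]
edges = build_facade_graph(boxes)
```

In this toy layout the edge between the two upper windows receives a much larger weight than the edge from an upper window to the one directly below it, even though the latter pair is closer in Euclidean distance. That is the kind of bias a vertical prior is meant to inject before any attention layer is applied.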
📝 Abstract
Automated analysis of building facades from street-level imagery has great potential for urban analytics, energy assessment, and emergency planning. However, it requires reasoning over spatially arranged elements rather than solely isolated detections. In this work, we model each facade as a graph over window/door detections with a vertical prior on edges. Additionally, we introduce GATA2Floor, a multi-head Graph Attention v2 (GATv2) based model that predicts the global floor count of a building and, via learnable cross-attention queries, softly assigns elements to latent floor slots, yielding interpretable outputs and robustness to irregular designs. To mitigate the lack of labeled datasets, we demonstrate that the proposed graph-based reasoning can be applied without annotations by leveraging a lightweight label-free proposal mechanism based on self-supervised features and vision-language scoring. Our approach demonstrates the value of graph-attention-based relational reasoning for facade understanding.
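The soft assignment of elements to latent floor slots can be illustrated with a small sketch. This is an assumption-laden toy, not the paper's architecture: the function `soft_assign`, the scaled dot-product scoring, the choice to normalize over slots for each element, and the hand-picked query vectors are all hypothetical. In the actual model the queries would be learned and the node features would come from the GATv2 layers.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def soft_assign(node_feats, slot_queries):
    """Return, for each node, a probability distribution over slots.

    Scores are scaled dot products between slot queries and node
    features; normalizing over slots is an assumption of this sketch.
    """
    scale = math.sqrt(len(slot_queries[0]))
    assignments = []
    for h in node_feats:
        scores = [
            sum(q_k * h_k for q_k, h_k in zip(q, h)) / scale
            for q in slot_queries
        ]
        assignments.append(softmax(scores))
    return assignments

# Toy features: the first two nodes resemble each other, the third differs.
node_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
# Hypothetical fixed queries standing in for learned slot queries.
slot_queries = [[4.0, 0.0], [0.0, 4.0]]
assignments = soft_assign(node_feats, slot_queries)
```

Each row of `assignments` sums to one, so it can be read directly as "how strongly this window or door belongs to each floor slot", which is what makes this style of output inspectable.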
Problem

Research questions and friction points this paper is trying to address.

floor counting
facade understanding
street-view imagery
graph attention
urban analytics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Attention Network
Facade Understanding
Floor Counting
Cross-Attention
Label-Free Learning
Ngoc Tan Le
ETRO Department, Vrije Universiteit Brussel (VUB), Pleinlaan 2, B-1050 Brussels, Belgium; imec, Kapeldreef 75, B-3001 Leuven, Belgium
Tzoulio Chamiti
ETRO Department, Vrije Universiteit Brussel (VUB), Pleinlaan 2, B-1050 Brussels, Belgium; imec, Kapeldreef 75, B-3001 Leuven, Belgium
Eirini Papagiannopoulou
ETRO Department, Vrije Universiteit Brussel (VUB), Pleinlaan 2, B-1050 Brussels, Belgium; imec, Kapeldreef 75, B-3001 Leuven, Belgium
Nikos Deligiannis
Vrije Universiteit Brussel, imec
Signal Processing, Machine Learning, Computer Vision, Explainable AI