AI Summary
Existing work on polygon representation learning focuses predominantly on single polygons, neglecting the internal structure of multipolygons and the geometric dependencies among their constituent polygons. To address these limitations, this paper introduces Multipolygon-GNN, the first end-to-end graph neural network framework designed for multi-part polygonal assemblies. The approach: (i) constructs a heterogeneous visibility graph that jointly encodes intra- and inter-polygon topological and visibility relationships; (ii) proposes a heterogeneous spanning-tree sampling strategy, coupled with a rotation- and translation-invariant geometric encoding, to reduce graph redundancy and capture features that are unchanged by rigid-body motion; and (iii) jointly optimizes graph structure learning and representation learning. Evaluated on five real-world and synthetic datasets, the method improves performance on shape coding, building pattern classification, and geographic question answering. The code and datasets are publicly available.
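To make the idea of a rotation- and translation-invariant geometric encoding concrete, here is a minimal sketch. It is not the encoding used in the paper: it assumes a simple descriptor of edge lengths and signed turning angles, and the function names `invariant_features` and `rigid_transform` are hypothetical.

```python
import math


def invariant_features(vertices):
    """Return one (edge_length, turning_angle) pair per vertex of a closed polygon.

    Illustrative descriptor only (an assumption, not the paper's encoding).
    Both quantities depend on differences between neighbouring vertices,
    so they do not change under rotation or translation.
    """
    n = len(vertices)
    feats = []
    for i in range(n):
        px, py = vertices[i - 1]          # previous vertex (wraps around)
        cx, cy = vertices[i]              # current vertex
        nx, ny = vertices[(i + 1) % n]    # next vertex (wraps around)
        ax, ay = cx - px, cy - py         # incoming edge vector
        bx, by = nx - cx, ny - cy         # outgoing edge vector
        length = math.hypot(bx, by)
        # Signed angle from the incoming edge to the outgoing edge.
        angle = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        feats.append((length, angle))
    return feats


def rigid_transform(vertices, theta, dx, dy):
    """Rotate by theta (radians) about the origin, then translate by (dx, dy)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in vertices]


# A 2 x 1 rectangle and a rotated, shifted copy of it.
rect = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
moved = rigid_transform(rect, theta=0.7, dx=5.0, dy=-3.0)

original_feats = invariant_features(rect)
moved_feats = invariant_features(moved)
```

Up to floating-point error, `original_feats` and `moved_feats` agree, which is the property that a rigid-body-invariant encoding is meant to provide.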
Abstract
Polygon representation learning is essential for diverse applications, encompassing tasks such as shape coding, building pattern classification, and geographic question answering. While recent years have seen considerable advancements in this field, much of the focus has been on single polygons, overlooking the intricate inner- and inter-polygonal relationships inherent in multipolygons. To address this gap, our study introduces a comprehensive framework specifically designed for learning representations of polygonal geometries, particularly multipolygons. Central to our approach is the incorporation of a heterogeneous visibility graph, which seamlessly integrates both inner- and inter-polygonal relationships. To enhance computational efficiency and minimize graph redundancy, we implement a heterogeneous spanning tree sampling method. Additionally, we devise a rotation-translation invariant geometric representation, ensuring broader applicability across diverse scenarios. Finally, we introduce Multipolygon-GNN, a novel model tailored to leverage the spatial and semantic heterogeneity inherent in the visibility graph. Experiments on five real-world and synthetic datasets demonstrate its ability to capture informative representations for polygonal geometries. Code and data are available at https://github.com/dyu62/PolyGNN.
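The spanning tree sampling step can be illustrated with a short sketch. This is not the authors' heterogeneous sampling procedure, whose details are not given in the abstract: it assumes a plain randomized Kruskal pass with union-find over an edge list whose entries carry a type label (`"inner"` or `"inter"`). The function name, the edge format, and the toy graph are all hypothetical. Randomized Kruskal does not sample spanning trees uniformly, but it does show how a dense graph can be reduced to a connected, redundancy-free subset of typed edges.

```python
import random


def sample_spanning_tree(num_nodes, edges, seed=None):
    """Sample a spanning tree (or forest, if disconnected) from typed edges.

    edges: iterable of (u, v, edge_type) tuples with 0 <= u, v < num_nodes.
    Illustrative only: randomized Kruskal, not the paper's sampling method.
    """
    rng = random.Random(seed)
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    shuffled = list(edges)
    rng.shuffle(shuffled)

    tree = []
    for u, v, edge_type in shuffled:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:              # the edge joins two components
            parent[root_u] = root_v
            tree.append((u, v, edge_type))
    return tree


# Toy multipolygon: two triangles (nodes 0-2 and 3-5) whose boundary edges
# are labeled "inner", plus two assumed "inter" edges linking the triangles.
edges = [
    (0, 1, "inner"), (1, 2, "inner"), (2, 0, "inner"),
    (3, 4, "inner"), (4, 5, "inner"), (5, 3, "inner"),
    (1, 3, "inter"), (2, 5, "inter"),
]
tree = sample_spanning_tree(6, edges, seed=0)
```

For this connected six-node graph, the result keeps five of the eight edges, at least one of which is an `"inter"` edge, since the two triangles are only connected through those.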