Omni Geometry Representation Learning vs Large Language Models for Geospatial Entity Resolution

📅 2025-08-07
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Geospatial entity resolution (ER) must handle diverse geometry types (points, lines, polygons) without discarding spatial information. This paper proposes Omni, a framework with three parts: (1) an omni-geometry encoder that embeds point, line, polyline, polygon, and multi-polygon geometries rather than collapsing each to a single point; (2) an Attribute Affinity mechanism that applies transformer-based pre-trained language models to individual textual attributes of place records; and (3) an evaluation of large language models (LLMs) for geospatial ER under different prompting strategies and learning scenarios. Tested on existing point-only datasets and a new diverse-geometry dataset, Omni improves F1 by up to 12% over existing methods, and LLMs show competitive results.

📝 Abstract
The development, integration, and maintenance of geospatial databases rely heavily on efficient and accurate matching procedures of Geospatial Entity Resolution (ER). While resolution of points-of-interest (POIs) has been widely addressed, resolution of entities with diverse geometries has been largely overlooked. This is partly due to the lack of a uniform technique for embedding heterogeneous geometries seamlessly into a neural network framework. Existing neural approaches simplify complex geometries to a single point, resulting in significant loss of spatial information. To address this limitation, we propose Omni, a geospatial ER model featuring an omni-geometry encoder. This encoder is capable of embedding point, line, polyline, polygon, and multi-polygon geometries, enabling the model to capture the complex geospatial intricacies of the places being compared. Furthermore, Omni leverages transformer-based pre-trained language models over individual textual attributes of place records in an Attribute Affinity mechanism. The model is rigorously tested on existing point-only datasets and a new diverse-geometry geospatial ER dataset. Omni produces up to 12% (F1) improvement over existing methods. Furthermore, we test the potential of Large Language Models (LLMs) to conduct geospatial ER, experimenting with prompting strategies and learning scenarios, comparing the results of pre-trained language model-based methods with LLMs. Results indicate that LLMs show competitive results.
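
The information loss from reducing a geometry to a single point can be illustrated with a minimal sketch. It is not taken from the paper: the function name, the two example footprints, and the choice of the centroid as the representative point are all assumptions made for illustration. Two polygons of very different extent collapse to the same point, so a point-only model cannot tell them apart by location.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area_and_centroid(ring: List[Point]) -> Tuple[float, Point]:
    """Area and centroid of a simple polygon (shoelace formula).

    `ring` lists the vertices in order, without repeating the first vertex.
    """
    signed_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        signed_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    signed_area *= 0.5
    centroid = (cx / (6.0 * signed_area), cy / (6.0 * signed_area))
    return abs(signed_area), centroid


# Hypothetical footprints: a small kiosk and a large park sharing one centre.
small_kiosk = [(4.0, 4.0), (6.0, 4.0), (6.0, 6.0), (4.0, 6.0)]
large_park = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

for label, ring in [("kiosk", small_kiosk), ("park", large_park)]:
    area, centroid = polygon_area_and_centroid(ring)
    print(label, "area:", area, "centroid:", centroid)
```

Both footprints yield the centroid (5.0, 5.0), while their areas are 4.0 and 100.0. Shape and extent are the kind of spatial information that a point-only representation discards.
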
Problem

Research questions and friction points this paper is trying to address.

Resolving diverse geospatial entities beyond points-of-interest
Embedding heterogeneous geometries into neural networks effectively
Comparing transformer-based models with LLMs for geospatial ER
Innovation

Methods, ideas, or system contributions that make the work stand out.

Omni-geometry encoder for diverse geospatial embeddings
Transformer-based language models for text attributes
Comparison of LLMs with traditional ER methods
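
As a rough illustration of what a few-shot prompting setup for LLM-based geospatial ER might look like, the sketch below assembles a yes/no matching prompt from labeled example pairs. The prompts actually used in the paper are not given on this page, so the `Place` fields, the wording of the instruction, and the use of Well-Known Text (WKT) for geometries are assumptions made here for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Place:
    # Hypothetical record layout; real datasets may carry other attributes.
    name: str
    category: str
    wkt: str  # geometry serialized as Well-Known Text


def describe(place: Place) -> str:
    """Serialize one place record as a single line of text."""
    return f"name: {place.name}; category: {place.category}; geometry: {place.wkt}"


def build_prompt(shots: List[Tuple[Place, Place, bool]], a: Place, b: Place) -> str:
    """Build a few-shot prompt; an empty `shots` list gives a zero-shot prompt."""
    lines = [
        "Do the two place records refer to the same real-world entity? "
        "Reply yes or no.",
        "",
    ]
    for left, right, is_match in shots:
        lines += [
            f"A: {describe(left)}",
            f"B: {describe(right)}",
            f"Answer: {'yes' if is_match else 'no'}",
            "",
        ]
    # The query pair goes last, with the answer left open for the model.
    lines += [f"A: {describe(a)}", f"B: {describe(b)}", "Answer:"]
    return "\n".join(lines)


# Made-up records, used only to show the prompt layout.
shot = (
    Place("City Park", "park", "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))"),
    Place("City Park", "leisure", "POINT (5 5)"),
    True,
)
query_a = Place("Harbour Road", "road", "LINESTRING (0 0, 3 4)")
query_b = Place("Harbor Rd", "street", "LINESTRING (0 0, 3 4, 6 8)")
prompt = build_prompt([shot], query_a, query_b)
print(prompt)
```

Passing the geometry as text lets the same prompt template cover points, lines, and polygons. Whether an LLM makes good use of raw coordinates is an empirical question of the kind the paper's experiments address.
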