GeoAggregator: An Efficient Transformer Model for Geo-Spatial Tabular Data

📅 2025-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the poor scalability of deep learning models for geospatial tabular data and the difficulty in jointly modeling spatial autocorrelation and heterogeneity, this paper proposes a lightweight geographically aware Transformer architecture. Methodologically, it introduces three key innovations: (1) a novel Gaussian-biased local attention mechanism that explicitly encodes spatial dependencies while reducing computational complexity; (2) Cartesian-product attention to efficiently capture multidimensional spatial interactions; and (3) a global position-aware geographic embedding that integrates spatial statistical priors. Evaluated on multiple synthetic and real-world geospatial datasets, the model consistently ranks among the top three in predictive performance, reduces parameter count by over 40%, and achieves a 2.3× speedup in inference latency—outperforming both XGBoost and state-of-the-art geospatial deep learning models.
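The Gaussian-biased local attention described above can be illustrated with a minimal sketch: attention logits are penalized by the squared geographic distance between query and key locations, so nearby points receive higher weight, encoding spatial autocorrelation. This is an illustrative reading, not the paper's exact formulation; the function name, the `sigma` bandwidth parameter, and the single-head NumPy setup are assumptions for clarity.

```python
import numpy as np

def gaussian_biased_attention(q, k, v, coords_q, coords_k, sigma=1.0):
    """Scaled dot-product attention with an additive Gaussian distance bias.

    Hypothetical sketch: logits are reduced by squared geographic distance,
    so spatially close points dominate the attention weights.
    q: (nq, d), k/v: (nk, d), coords_q: (nq, 2), coords_k: (nk, 2)
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)                           # (nq, nk)
    # pairwise squared distances between query and key locations
    dist2 = ((coords_q[:, None, :] - coords_k[None, :, :]) ** 2).sum(-1)
    logits = logits - dist2 / (2.0 * sigma ** 2)            # Gaussian bias
    w = np.exp(logits - logits.max(-1, keepdims=True))
    w = w / w.sum(-1, keepdims=True)                        # softmax
    return w @ v
```

With a small `sigma`, distant keys are effectively masked out, which is also how the mechanism reduces computational cost in spirit: only a local neighborhood contributes meaningfully to each output.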

📝 Abstract
Modeling geospatial tabular data with deep learning has become a promising alternative to traditional statistical and machine learning approaches. However, existing deep learning models often face challenges related to scalability and flexibility as datasets grow. To this end, this paper introduces GeoAggregator, an efficient and lightweight algorithm based on the transformer architecture, designed specifically for geospatial tabular data modeling. GeoAggregator explicitly accounts for spatial autocorrelation and spatial heterogeneity through Gaussian-biased local attention and global positional awareness. Additionally, we introduce a new attention mechanism that uses the Cartesian product to manage model size while maintaining strong expressive power. We benchmark GeoAggregator against spatial statistical models, XGBoost, and several state-of-the-art geospatial deep learning methods on both synthetic and empirical geospatial datasets. The results demonstrate that GeoAggregator achieves the best or second-best performance compared to its competitors on nearly all datasets. GeoAggregator's efficiency is underscored by its reduced model size, making it both scalable and lightweight. Moreover, ablation experiments offer insights into the effectiveness of the Gaussian bias and the Cartesian attention mechanism, providing recommendations for further optimizing GeoAggregator's performance.
Problem

Research questions and friction points this paper is trying to address.

Efficient modeling of geospatial tabular data
Addressing scalability in deep learning models
Incorporating spatial autocorrelation and heterogeneity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer architecture for geospatial data
Gaussian-biased local attention mechanism
Cartesian product attention for model efficiency
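One plausible reading of the Cartesian-product attention idea is that a large set of query vectors is built from two much smaller learnable sets by concatenating every pair, so expressive capacity grows multiplicatively while the parameter count grows only additively. The sketch below is purely illustrative and not necessarily the paper's exact construction; the function name and shapes are assumptions.

```python
import numpy as np

def cartesian_queries(a, b):
    """Build h1*h2 full query vectors from two small learnable sets.

    Hypothetical sketch: every pair (a[i], b[j]) is concatenated, forming
    the Cartesian product of the two sets.
    a: (h1, d1), b: (h2, d2)  ->  (h1*h2, d1+d2)
    Learned parameters: h1*d1 + h2*d2 instead of h1*h2*(d1+d2).
    """
    h1, _ = a.shape
    h2, _ = b.shape
    a_rep = np.repeat(a, h2, axis=0)   # (h1*h2, d1): a[0], a[0], ..., a[1], ...
    b_til = np.tile(b, (h1, 1))        # (h1*h2, d2): b[0], b[1], ..., b[0], ...
    return np.concatenate([a_rep, b_til], axis=1)
```

For example, two sets of sizes 8 and 8 yield 64 composite queries from only 16 learned vectors, which is one way a model can stay small while keeping strong expressive power.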