🤖 AI Summary
This study addresses the limited spatiotemporal reasoning capability of large language models (LLMs) in understanding gridded geospatial data. We systematically compare structured prompt engineering against supervised fine-tuning (SFT). Our method introduces a geospatial-semantic–aware encoding scheme for gridded spatiotemporal data, constructs user-assistant interactive training data, and performs domain-adaptive fine-tuning. Through quantitative experiments—first of their kind—we reveal the fundamental limitations of zero-shot structured prompting on complex spatiotemporal reasoning tasks. We further demonstrate that domain-specific fine-tuning substantially enhances LLMs’ joint modeling capacity for geographic entities, temporal relations, and spatial topologies. On geographic question answering and spatiotemporal inference tasks, the fine-tuned model achieves an average accuracy improvement of 32.7% over zero-shot prompting, with markedly improved generalization and robustness.
📝 Abstract
This paper presents a comparative study of large language models (LLMs) in interpreting grid-structured geospatial data. We evaluate the performance of a base model through structured prompting and contrast it with a fine-tuned variant trained on a dataset of user-assistant interactions. Our results highlight the strengths and limitations of zero-shot prompting and demonstrate the benefits of fine-tuning for structured geospatial and temporal reasoning.