🤖 AI Summary
Large language models (LLMs) exhibit significant limitations in geospatial reasoning, particularly in understanding road networks, distances, and directional relationships, which hinders their deployment in time-critical disaster response scenarios. To address this, the authors propose RoadMind, a self-supervised learning framework tailored for spatial tasks that builds structured map representations from OpenStreetMap. The method integrates road infrastructure semantics into both the pretraining and fine-tuning stages via QLoRA adapters on 4-bit quantized models, enabling efficient offline deployment without human annotation, and it generalizes across Los Angeles, Christchurch, and Manila. Experimental results show substantial improvements over strong baselines on road segment identification, nearest-road retrieval, and distance/direction estimation, marking the first lightweight, deployable LLM explicitly enhanced for geospatial reasoning.
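The summary does not include code, but the QLoRA setup it describes (a 4-bit quantized base model with low-rank adapters) can be sketched with the Hugging Face `transformers` and `peft` libraries. This is an illustrative configuration only: the base model name, adapter rank, and target modules are assumptions, not values reported by the paper.

```python
# Hypothetical sketch of a QLoRA setup: 4-bit quantized base model plus
# low-rank adapters. Model name, rank, and target modules are assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit quantization (QLoRA)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=16,                                   # adapter rank (assumed)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # attention projections (assumed)
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",             # assumed base model
    quantization_config=bnb_config,
)
model = get_peft_model(model, lora_config)  # only adapter weights are trained
```

Because only the small adapter matrices are trained while the frozen base model stays in 4-bit precision, this kind of setup keeps memory low enough for single-GPU fine-tuning and offline deployment.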
📝 Abstract
Large Language Models (LLMs) have shown impressive performance across a range of natural language tasks, but remain limited in their ability to reason about geospatial data, particularly road networks, distances, and directions. This gap poses challenges in disaster scenarios, where spatial understanding is critical for tasks such as evacuation planning and resource allocation. In this work, we present RoadMind, a self-supervised framework that enhances the geospatial reasoning capabilities of LLMs using structured data from OpenStreetMap (OSM). Our automated pipeline extracts road infrastructure data for a given city and converts it into multiple supervision formats tailored to key spatial tasks. We pretrain and fine-tune LLMs on these representations using QLoRA adapters and 4-bit quantized models. We evaluate our approach on three disaster-prone cities with varying global representation (Los Angeles, Christchurch, and Manila) across tasks such as road segment identification, nearest road retrieval, and distance/direction estimation. Our results show that models trained via RoadMind significantly outperform strong baselines, including state-of-the-art LLMs equipped with advanced prompt engineering. This demonstrates the potential of structured geospatial data to enhance language models with robust spatial reasoning, enabling more effective offline AI systems for disaster response.
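The distance/direction estimation task the abstract evaluates can be grounded in standard great-circle geometry. The sketch below shows the conventional haversine distance and initial-bearing formulas; the function names and the 8-way compass bucketing are illustrative, not taken from the paper.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance on a sphere of mean Earth radius ~6371 km.
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def bearing_deg(lat1, lon1, lat2, lon2):
    # Initial compass bearing from point 1 to point 2, in degrees [0, 360).
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    x = math.sin(dl) * math.cos(p2)
    y = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    return (math.degrees(math.atan2(x, y)) + 360.0) % 360.0

def compass_8(bearing):
    # Bucket a bearing into one of the eight principal compass directions.
    dirs = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    return dirs[int((bearing + 22.5) // 45) % 8]
```

For example, two points one degree of longitude apart on the equator are about 111.2 km apart, with an initial bearing of 90° (due east).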