🤖 AI Summary
Large language models (LLMs) lack reliable geodesic distance awareness and spatial reasoning capabilities, limiting their applicability in real-world scenarios such as point-of-interest (POI) recommendation and itinerary planning. To address this, we propose a lightweight, plug-and-play retrieval-augmented framework that encodes geographic distances into a retrievable graph structure. By combining geographic spatial indexing with a distance-aware retrieval-augmented generation (RAG) mechanism, our method dynamically constructs question-relevant context subgraphs, thereby giving LLMs a structured, graph-based "world model" of spatial relationships. Crucially, the approach requires no model parameter fine-tuning and enables zero-shot geodesic distance inference over massive numbers of previously unseen location combinations. Experimental results demonstrate substantial improvements in LLM accuracy on distance-related question-answering tasks, effectively addressing the spatial cognition bottleneck inherent in purely parametric models.
📄 Abstract
Many real-world tasks where Large Language Models (LLMs) can be used require spatial reasoning, such as Point of Interest (POI) recommendation and itinerary planning. However, on their own, LLMs lack reliable spatial reasoning capabilities, especially about distances. To address this problem, we develop a novel approach, DistRAG, that enables an LLM to retrieve relevant spatial information not explicitly learned during training. Our method encodes the geodesic distances between cities and towns in a graph and retrieves a context subgraph relevant to the question. Using this technique, our method enables an LLM to answer distance-based reasoning questions that it otherwise cannot answer. Given the vast array of possible places an LLM could be asked about, DistRAG offers a flexible first step towards providing a rudimentary "world model" to complement the linguistic knowledge held in LLMs.
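The pipeline the abstract describes (encode pairwise geodesic distances in a graph, retrieve the subgraph mentioning the queried places, and serialize it as LLM context) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the city list, the haversine distance formula, and the string-matching retrieval step are all simplifying assumptions made here for clarity.

```python
import math

# Approximate (latitude, longitude) coordinates in degrees; a hypothetical
# toy stand-in for the paper's full set of cities and towns.
CITIES = {
    "Paris": (48.8566, 2.3522),
    "Berlin": (52.5200, 13.4050),
    "Madrid": (40.4168, -3.7038),
}

def geodesic_km(a, b):
    """Great-circle (haversine) distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # mean Earth radius ~6371 km

def build_distance_graph(cities):
    """Complete graph as an edge dict: (city_u, city_v) -> distance in km."""
    names = sorted(cities)
    return {(u, v): geodesic_km(cities[u], cities[v])
            for i, u in enumerate(names) for v in names[i + 1:]}

def retrieve_subgraph(graph, question):
    """Keep only edges whose endpoints are both mentioned in the question."""
    mentioned = {c for edge in graph for c in edge if c in question}
    return {e: d for e, d in graph.items()
            if e[0] in mentioned and e[1] in mentioned}

def as_context(subgraph):
    """Serialize retrieved edges as plain text to prepend to the LLM prompt."""
    return "\n".join(f"{u} to {v}: {d:.0f} km" for (u, v), d in subgraph.items())

graph = build_distance_graph(CITIES)
question = "Is Paris closer to Berlin or to Madrid?"
print(as_context(retrieve_subgraph(graph, question)))
```

With the retrieved distances prepended to the prompt, the LLM can compare the two figures directly instead of relying on parametric (and often unreliable) spatial knowledge.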