Can LLMs Learn to Map the World from Local Descriptions?

📅 2025-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates whether large language models (LLMs) can spontaneously construct a consistent global spatial representation—encompassing spatial perception (i.e., inferring global layouts) and spatial navigation (i.e., learning road networks and planning paths)—solely from local, relative, coordinate-free spatial descriptions (e.g., “A is east of B”). We propose three methodological components: zero-shot/few-shot spatial relation reasoning, trajectory-text-driven road network modeling, and latent geometric alignment analysis. Our study provides the first systematic empirical validation that LLMs can emergently acquire global spatial representations from unlabeled, fragmented spatial utterances: they generalize relationships among unseen points of interest (POIs) in simulated cities, their latent embeddings exhibit statistically significant alignment with ground-truth geographic distributions (p < 0.01), and they support high-accuracy end-to-end path planning. These findings reveal LLMs’ implicit capacity to model real-world spatial structure, establishing a novel paradigm for embodied AI and geographic reasoning.
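The core task the summary describes, composing local, coordinate-free statements like "A is east of B" into a consistent global layout and then answering about unseen POI pairs, can be sketched symbolically. This is an illustrative baseline, not the paper's LLM-based method: the relation vocabulary, the unit-offset assumption, and all function names here are hypothetical.

```python
from collections import deque

# Hypothetical unit offsets for cardinal relations ("A is east of B" -> A = B + (1, 0)).
OFFSETS = {"east": (1, 0), "west": (-1, 0), "north": (0, 1), "south": (0, -1)}

def infer_layout(statements):
    """Propagate a global layout from local, coordinate-free relations.

    statements: iterable of (a, relation, b) triples meaning "a is <relation> of b".
    Returns a dict mapping each POI to inferred (x, y) coordinates, anchored at an
    arbitrary origin -- only relative positions are meaningful.
    """
    graph = {}
    for a, rel, b in statements:
        dx, dy = OFFSETS[rel]
        graph.setdefault(b, []).append((a, dx, dy))
        graph.setdefault(a, []).append((b, -dx, -dy))  # add the inverse relation

    start = next(iter(graph))
    coords = {start: (0, 0)}
    queue = deque([start])
    while queue:  # BFS: assign coordinates relative to the anchor POI
        node = queue.popleft()
        x, y = coords[node]
        for nbr, dx, dy in graph[node]:
            if nbr not in coords:
                coords[nbr] = (x + dx, y + dy)
                queue.append(nbr)
    return coords

def relation(coords, a, b):
    """Derive a (possibly never-stated) east/west relation from the layout."""
    return "east" if coords[a][0] > coords[b][0] else "west"

layout = infer_layout([("A", "east", "B"), ("C", "east", "A")])
print(relation(layout, "C", "B"))  # "east" -- C is east of B, never stated directly
```

The point of the sketch is the generalization step the paper tests: the relation between C and B is never given, but it follows from composing the two local statements, which is what the LLMs are shown to do implicitly from text alone.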

📝 Abstract
Recent advances in Large Language Models (LLMs) have demonstrated strong capabilities in tasks such as code generation and mathematical reasoning. However, their potential to internalize structured spatial knowledge remains underexplored. This study investigates whether LLMs, grounded in locally relative human observations, can construct coherent global spatial cognition by integrating fragmented relational descriptions. We focus on two core aspects of spatial cognition: spatial perception, where models infer consistent global layouts from local positional relationships, and spatial navigation, where models learn road connectivity from trajectory data and plan optimal paths between unconnected locations. Experiments conducted in a simulated urban environment demonstrate that LLMs not only generalize to unseen spatial relationships between points of interest (POIs) but also exhibit latent representations aligned with real-world spatial distributions. Furthermore, LLMs can learn road connectivity from trajectory descriptions, enabling accurate path planning and dynamic spatial awareness during navigation.
Problem

Research questions and friction points this paper is trying to address.

Can LLMs integrate local descriptions into global spatial cognition?
Do LLMs infer global layouts from local positional relationships?
Can LLMs learn road connectivity for optimal path planning?
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs infer global layouts from local relationships
LLMs learn road connectivity from trajectory data
LLMs exhibit real-world aligned spatial representations
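The second and third points above can be made concrete with a classical analogue: accumulating road connectivity from observed trajectories and planning a route between POIs that never co-occurred in any single trajectory. This is a minimal graph-based sketch of the task setup, not the paper's trajectory-text LLM pipeline; the data format and function names are assumptions for illustration.

```python
from collections import deque

def build_road_graph(trajectories):
    """Accumulate an undirected road graph from observed trajectories.

    trajectories: iterable of POI sequences, e.g. ["A", "B", "C"] meaning a
    traversal A -> B -> C; consecutive POIs are treated as directly connected.
    """
    graph = {}
    for traj in trajectories:
        for u, v in zip(traj, traj[1:]):
            graph.setdefault(u, set()).add(v)
            graph.setdefault(v, set()).add(u)
    return graph

def plan_path(graph, src, dst):
    """BFS shortest path between POIs, possibly never seen together in one trajectory."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:  # walk the parent chain back to the source
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nbr in graph.get(node, ()):
            if nbr not in parents:
                parents[nbr] = node
                queue.append(nbr)
    return None  # no route known from the observed trajectories

roads = build_road_graph([["A", "B", "C"], ["C", "D"], ["B", "E"]])
print(plan_path(roads, "A", "D"))  # ['A', 'B', 'C', 'D']
```

Note that A and D never appear in the same trajectory; the route only exists once connectivity from separate trajectories is merged, which mirrors the end-to-end path-planning ability the paper attributes to LLMs trained on fragmented trajectory descriptions.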
Sirui Xia
Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University
Aili Chen
Fudan University
Large Language Model, Reasoning and Planning, Language Agent, LLM Personalization
Xintao Wang
Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University
Tinghui Zhu
University of California, Davis
Natural Language Processing, Vision-Language Models
Yikai Zhang
Fudan University
Natural Language Processing, Autonomous Agent
Jiangjie Chen
ByteDance Seed
NLP, Machine Reasoning, Large Language Models, Autonomous Agent
Yanghua Xiao
Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University