Abstract
Over the past year, the development of large language models (LLMs) has brought spatial intelligence into focus, with particular attention on vision-based embodied intelligence. However, spatial intelligence spans a broader range of disciplines and scales, from navigation and urban planning to remote sensing and Earth science. What are the differences and connections between spatial intelligence across these fields? In this paper, we first review human spatial cognition and its implications for spatial intelligence in LLMs. We then examine spatial memory, knowledge representations, and abstract reasoning in LLMs, highlighting their roles and connections. Finally, we analyze spatial intelligence across scales -- from embodied to urban and global levels -- following a framework that progresses from spatial memory and understanding to spatial reasoning and intelligence. Through this survey, we aim to provide insights into interdisciplinary spatial intelligence research and inspire future studies.