🤖 AI Summary
This work investigates whether large language models (LLMs) implicitly encode a linear spatial world model, i.e., linear, Euclidean representations of physical positions and object layouts, acquired without explicit spatial training. We introduce a four-step paradigm: (1) synthetic spatial data generation; (2) decoding of positional information with linear probes; (3) quantification of the geometric consistency of the embeddings via triangle-inequality satisfaction (>98%); and (4) causal intervention at attention layers (ablation and activation patching). We empirically demonstrate, for the first time across multiple mainstream LLMs, high-fidelity position decoding (R² > 0.92) and confirm that these representations are causally used in spatial reasoning. Our contributions are threefold: (1) establishing the existence of implicit linear spatial modeling in LLMs; (2) proposing the first formal framework and causal-validation methodology for probing such representations; and (3) revealing that language models possess unsupervised geometric spatial priors: latent knowledge of Euclidean structure that emerges from language-only pretraining.
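Steps (2) and (3) of the paradigm can be sketched in a few lines. The sketch below is illustrative only: it substitutes synthetic embeddings (generated as a noisy linear function of 2-D positions) for real LLM hidden states, and the pair features and probe choices are our assumptions, not the paper's exact setup. Note that a distance probe's outputs are not automatically a metric, which is what makes triangle-inequality satisfaction an informative consistency check.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for contextual embeddings (assumption: in the real
# pipeline these come from a frozen LLM over spatial scene descriptions).
n, d_model = 300, 64
pos = rng.uniform(-1, 1, size=(n, 2))                 # ground-truth (x, y)
W_true = rng.normal(size=(2, d_model))
emb = pos @ W_true + 0.05 * rng.normal(size=(n, d_model))

# Step 2: linear position probe via least squares, R^2 on held-out items.
tr, te = slice(0, 240), slice(240, None)
W_probe, *_ = np.linalg.lstsq(emb[tr], pos[tr], rcond=None)
pred = emb[te] @ W_probe
r2 = 1 - ((pos[te] - pred) ** 2).sum() / ((pos[te] - pos[te].mean(0)) ** 2).sum()

# Step 3: a separate linear probe predicts pairwise distances from pair
# features; predicted distances need not satisfy the metric axioms.
pairs = rng.integers(0, n, size=(2000, 2))
feat = np.abs(emb[pairs[:, 0]] - emb[pairs[:, 1]])    # simple pair features
dist = np.linalg.norm(pos[pairs[:, 0]] - pos[pairs[:, 1]], axis=1)
W_d, *_ = np.linalg.lstsq(feat, dist[:, None], rcond=None)

def d_hat(i, j):
    """Probe-predicted distance between items i and j."""
    return (np.abs(emb[i] - emb[j]) @ W_d).item()

# Fraction of random triples satisfying d(a,c) <= d(a,b) + d(b,c).
triples = rng.integers(0, n, size=(5000, 3))
ok = np.mean([d_hat(a, c) <= d_hat(a, b) + d_hat(b, c) + 1e-9
              for a, b, c in triples])
print(f"probe R^2 = {r2:.3f}, triangle inequality satisfied = {ok:.1%}")
```

Because the synthetic embeddings encode position linearly by construction, both metrics come out high here; on real model activations they are the empirical quantities the paper reports (R² > 0.92, >98% triangle-inequality satisfaction).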
📝 Abstract
Large language models (LLMs) have demonstrated emergent abilities across diverse tasks, raising the question of whether they acquire internal world models. In this work, we investigate whether LLMs implicitly encode linear spatial world models, which we define as linear representations of physical space and object configurations. We introduce a formal framework for spatial world models and assess whether such structure emerges in contextual embeddings. Using a synthetic dataset of object positions, we train linear probes to decode object positions and evaluate the geometric consistency of the underlying representation space. We further conduct causal interventions to test whether these spatial representations are functionally used by the model. Our results provide empirical evidence that LLMs encode linear spatial world models.
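The causal interventions mentioned above (attention-layer ablation and activation patching) can be illustrated mechanically on a toy network. Everything below is a simplified stand-in: the two-layer network replaces a transformer's residual stream, and the choice of dimensions 0-3 as the "spatial subspace" is a hypothetical assumption made for the example, not a finding of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-layer network standing in for a transformer layer's activations.
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 4))

def forward(x, hook=None):
    h = np.tanh(x @ W1)           # hidden activations to intervene on
    if hook is not None:
        h = hook(h)               # intervention point
    return h @ W2

x_clean = rng.normal(size=(1, 8))  # prompt whose spatial readout we test
x_donor = rng.normal(size=(1, 8))  # counterfactual prompt
base = forward(x_clean)

# Ablation: zero out the hypothesized spatial subspace (dims 0-3, assumed).
ablate = lambda h: np.concatenate([np.zeros_like(h[:, :4]), h[:, 4:]], axis=1)
out_ablate = forward(x_clean, hook=ablate)

# Activation patching: overwrite the same subspace with the donor run's
# activations, keeping everything else from the clean run.
h_donor = np.tanh(x_donor @ W1)
patch = lambda h: np.concatenate([h_donor[:, :4], h[:, 4:]], axis=1)
out_patch = forward(x_clean, hook=patch)

# If the subspace is causally used, both interventions shift the output.
print("ablation effect:", np.abs(out_ablate - base).max())
print("patching effect:", np.abs(out_patch - base).max())
```

In practice such hooks are registered on specific attention layers of the real model, and the effect is measured on the model's spatial-reasoning answers rather than on raw output vectors; a nonzero behavioral shift under patching is what licenses the causal claim.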