AI Summary
This study investigates whether the behavioral outputs of large language models (LLMs) reflect the semantic geometry of their internal hidden states. Using two psycholinguistic paradigms, similarity-based forced choice and free association, the authors collect behavioral data from eight instruction-tuned Transformer models over a shared 5,000-word vocabulary and construct behavioral similarity matrices. These matrices are then compared with hidden-state representations at each layer via representational similarity analysis. The work presents the first systematic comparison of the two behavioral paradigms in recovering LLMs' internal semantic geometry, finding that forced choice aligns substantially better with hidden-state geometry than free association. Notably, in a held-out-words regression, behavioral similarities derived from forced choice predict hidden-state similarities for unseen words beyond lexical baselines such as FastText and BERT, as well as cross-model consensus, indicating that behavior-only measurements retain recoverable information about internal representations.
Abstract
We investigate the extent to which an LLM's hidden-state geometry can be recovered from its behavior in psycholinguistic experiments. Across eight instruction-tuned transformer models, we run two experimental paradigms -- similarity-based forced choice and free association -- over a shared 5,000-word vocabulary, collecting 17.5M+ trials to build behavior-based similarity matrices. Using representational similarity analysis, we compare behavioral geometries to layerwise hidden-state similarity and benchmark against FastText, BERT, and cross-model consensus. We find that forced-choice behavior aligns substantially more with hidden-state geometry than free association. In a held-out-words regression, behavioral similarity (especially forced choice) predicts unseen hidden-state similarities beyond lexical baselines and cross-model consensus, indicating that behavior-only measurements retain recoverable information about internal semantic geometry. Finally, we discuss implications for the ability of behavioral tasks to uncover hidden cognitive states.
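The comparison step described above can be illustrated with a minimal sketch of representational similarity analysis. This is not the authors' code: the function names, the use of cosine similarity for hidden states, the Spearman-style rank correlation, and the synthetic data are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def upper_triangle(sim):
    """Return the off-diagonal upper-triangle entries of a square similarity matrix."""
    sim = np.asarray(sim, dtype=float)
    rows, cols = np.triu_indices(sim.shape[0], k=1)
    return sim[rows, cols]


def ranks(values):
    """Rank values 0..n-1. Ties are broken by position, which is acceptable
    for continuous data but not a full tie-aware Spearman implementation."""
    order = np.argsort(values, kind="stable")
    out = np.empty(len(values), dtype=float)
    out[order] = np.arange(len(values))
    return out


def rsa_score(sim_a, sim_b):
    """Rank-correlate the word-pair similarities of two matrices (assumed metric)."""
    a = ranks(upper_triangle(sim_a))
    b = ranks(upper_triangle(sim_b))
    return float(np.corrcoef(a, b)[0, 1])


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between row vectors (one row per word)."""
    vectors = np.asarray(vectors, dtype=float)
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T


# Synthetic stand-ins (hypothetical): 50 "words" with 16-dimensional hidden states.
rng = np.random.default_rng(0)
hidden_sim = cosine_similarity_matrix(rng.normal(size=(50, 16)))

# A "behavioral" matrix that is a noisy, symmetrized copy of the hidden-state geometry.
noise = 0.1 * rng.normal(size=hidden_sim.shape)
behavior_sim = hidden_sim + (noise + noise.T) / 2

# A control matrix built from unrelated vectors.
unrelated_sim = cosine_similarity_matrix(rng.normal(size=(50, 16)))

print("behavior vs hidden: ", rsa_score(behavior_sim, hidden_sim))
print("unrelated vs hidden:", rsa_score(unrelated_sim, hidden_sim))
```

In a layerwise analysis, `rsa_score` would be computed once per layer against the same behavioral matrix. Only the upper triangle is used so that each word pair is counted once and the trivial diagonal is excluded.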