The Representational Alignment between Humans and Language Models is implicitly driven by a Concreteness Effect

📅 2025-05-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether lexical concreteness—a core cognitive dimension—dominates semantic alignment between humans and language models. Method: Leveraging behavioral semantic distance estimation, representational similarity analysis (RSA), and explicit concreteness rating tasks, we conducted controlled ablation experiments that systematically neutralized confounding variables such as word frequency and familiarity across multiple layers of diverse language models. Contribution/Results: We find statistically significant semantic alignment between human subjects and model representations (p < 0.001), with concreteness accounting for substantially more variance than other established psycholinguistic features—including imageability and affective valence. This is the first demonstration that concreteness serves as a latent, cross-system driver of semantic alignment between human cognition and artificial language systems. Our results establish concreteness as a foundational semantic anchor, offering a theoretically grounded principle for understanding and evaluating cross-architectural and cross-cognitive semantic representation.

📝 Abstract
The nouns of our language refer to either concrete entities (like a table) or abstract concepts (like justice or love), and cognitive psychology has established that concreteness influences how words are processed. Accordingly, understanding how concreteness is represented in our mind and brain is a central question in psychology, neuroscience, and computational linguistics. While the advent of powerful language models has allowed for quantitative inquiries into the nature of semantic representations, it remains largely underexplored how they represent concreteness. Here, we used behavioral judgments to estimate semantic distances implicitly used by humans, for a set of carefully selected abstract and concrete nouns. Using Representational Similarity Analysis, we find that the implicit representational space of participants and the semantic representations of language models are significantly aligned. We also find that both representational spaces are implicitly aligned to an explicit representation of concreteness, which was obtained from our participants using an additional concreteness rating task. Importantly, using ablation experiments, we demonstrate that the human-to-model alignment is substantially driven by concreteness, but not by other important word characteristics established in psycholinguistics. These results indicate that humans and language models converge on the concreteness dimension, but not on other dimensions.
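The RSA step described above compares two representational dissimilarity matrices (RDMs): one from human behavioral distance judgments and one from a model's embedding space. Below is a minimal, numpy-only sketch of that comparison; the random data stand in for the paper's nouns and embeddings and are purely illustrative, and the Spearman implementation assumes no tied values.

```python
import numpy as np

def condensed_rdm(embeddings):
    """Pairwise Euclidean distances between items, upper triangle only
    (the condensed form of a representational dissimilarity matrix)."""
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    iu = np.triu_indices(len(embeddings), k=1)
    return dist[iu]

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks
    (valid here because continuous random data has no ties)."""
    ra, rb = a.argsort().argsort(), b.argsort().argsort()
    return np.corrcoef(ra, rb)[0, 1]

rng = np.random.default_rng(0)
n_words = 20  # illustrative stimulus count, not the paper's set

# Stand-ins for human-derived and model-derived representations
human_rdm = condensed_rdm(rng.normal(size=(n_words, 8)))
model_rdm = condensed_rdm(rng.normal(size=(n_words, 16)))

# RSA alignment score: rank correlation between the two RDMs
rho = spearman(human_rdm, model_rdm)
```

In the paper this correlation is computed against actual behavioral distance estimates and per-layer model representations; the sketch only shows the shape of the computation.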
Problem

Research questions and friction points this paper is trying to address.

How concreteness is represented in human and language model semantics
Alignment between human and model semantic spaces driven by concreteness
Impact of concreteness versus other word characteristics on representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Behavioral judgments estimate human semantic distances
Representational Similarity Analysis aligns human and model spaces
Ablation experiments show concreteness drives alignment
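One common way to operationalize an ablation like the one above is to partial a concreteness-distance predictor out of both RDMs and recompute the alignment: if alignment drops, that dimension was carrying it. The sketch below demonstrates this logic on synthetic data in which a shared concreteness component is built in by construction; it is a generic illustration of the technique, not the paper's actual procedure or stimuli.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
iu = np.triu_indices(n, k=1)

# Hypothetical per-word concreteness ratings on a 1-5 scale,
# turned into a pairwise concreteness-distance predictor.
concreteness = rng.uniform(1, 5, size=n)
conc_rdm = np.abs(concreteness[:, None] - concreteness[None, :])[iu]

# Synthetic human and model RDMs that share a concreteness component
# plus independent noise (weights chosen for illustration only).
human_rdm = 0.6 * conc_rdm + rng.normal(scale=0.3, size=conc_rdm.size)
model_rdm = 0.7 * conc_rdm + rng.normal(scale=0.3, size=conc_rdm.size)

def residualize(y, x):
    """Remove the linear contribution of predictor x from y (OLS residuals)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Alignment before and after ablating the concreteness dimension
r_full = np.corrcoef(human_rdm, model_rdm)[0, 1]
r_ablated = np.corrcoef(residualize(human_rdm, conc_rdm),
                        residualize(model_rdm, conc_rdm))[0, 1]
```

Because the synthetic RDMs share only the concreteness component, `r_full` is high while `r_ablated` collapses toward zero, which is the signature pattern the paper reports for human-to-model alignment.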
Cosimo Iaia
Goethe University Frankfurt
Bhavin Choksi
Goethe University Frankfurt
Emily Wiebers
Goethe University Frankfurt
Gemma Roig
Goethe University Frankfurt
computational vision · artificial intelligence · computer vision · xAI
Christian J. Fiebach
Goethe University Frankfurt, Brain Imaging Center