Cat, Rat, Meow: On the Alignment of Language Model and Human Term-Similarity Judgments

📅 2025-04-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the degree of representational and behavioral alignment between generative language models and human semantic cognition in lexical similarity judgment. We introduce a word-triplet evaluation framework to systematically assess 32 open-source models, spanning diverse scales and training paradigms, using multi-layer Transformer representation extraction, cosine-similarity-based representational alignment analysis, and Spearman correlation for behavioral consistency evaluation. Key findings: (1) intermediate layers of small models (e.g., Phi-3, Gemma-2B) achieve human-level representational alignment; (2) instruction fine-tuning markedly improves behavioral consistency (+28.6% on average) without enhancing representational alignment; (3) only the largest evaluated model (Llama-3-70B) exhibits consistent trends in both representational and behavioral alignment; (4) alignment patterns are highly architecture- and layer-dependent, exhibiting strong heterogeneity. The study establishes a novel paradigm and empirical benchmark for evaluating cognitive alignment in language models.
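The evaluation pipeline described above can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' code: it assumes a triplet odd-one-out decision rule (the word least similar to the other two is "odd"), computed from cosine similarities over word embeddings, and uses hypothetical toy vectors and scores. Behavioral consistency is illustrated with `scipy.stats.spearmanr`.

```python
import numpy as np
from scipy.stats import spearmanr


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def odd_one_out(embs):
    """Pick the odd word in a triplet (a, b, c) of embeddings.

    The pair with the highest cosine similarity is kept together,
    so the remaining word is the odd one out.
    """
    sims = {
        0: cosine(embs[1], embs[2]),  # sim(b, c) high -> a is odd
        1: cosine(embs[0], embs[2]),  # sim(a, c) high -> b is odd
        2: cosine(embs[0], embs[1]),  # sim(a, b) high -> c is odd
    }
    return max(sims, key=sims.get)


# Toy 2-D "embeddings": cat and rat point the same way, meow does not.
cat = np.array([1.0, 0.0])
rat = np.array([0.9, 0.1])
meow = np.array([0.0, 1.0])
assert odd_one_out([cat, rat, meow]) == 2  # meow is the odd one out

# Behavioral consistency: Spearman correlation between model and
# human per-triplet scores (hypothetical toy data).
model_scores = [0.9, 0.2, 0.7, 0.4]
human_scores = [0.8, 0.3, 0.6, 0.5]
rho, _ = spearmanr(model_scores, human_scores)
```

Repeating the odd-one-out decision per layer, and comparing model choices to human choices, gives the layer-wise representational and behavioral alignment curves the summary refers to.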

📝 Abstract
Small and mid-sized generative language models have gained increasing attention. Their size and availability make them amenable to being analyzed at a behavioral as well as a representational level, allowing investigations of how these levels interact. We evaluate 32 publicly available language models for their representational and behavioral alignment with human similarity judgments on a word triplet task. This provides a novel evaluation setting to probe semantic associations in language beyond common pairwise comparisons. We find that (1) even the representations of small language models can achieve human-level alignment, (2) instruction-tuned model variants can exhibit substantially increased agreement, (3) the pattern of alignment across layers is highly model dependent, and (4) alignment based on models' behavioral responses is highly dependent on model size, matching their representational alignment only for the largest evaluated models.
Problem

Research questions and friction points this paper is trying to address.

Evaluate language models' alignment with human similarity judgments
Probe semantic associations beyond common pairwise comparisons
Assess representational and behavioral alignment across model sizes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates 32 models for human similarity alignment
Uses word triplet task for semantic associations
Analyzes representational and behavioral alignment levels
Lorenz Linhardt
TU Berlin
Machine Learning, Explainable AI, Interpretability, Neural Networks, Artificial Intelligence
Tom Neuhauser
Machine Learning Group, Technische Universitaet Berlin, Berlin, 10623, Germany; BIFOLD - Berlin Institute for the Foundations of Learning and Data, Berlin, 10623, Germany
Lenka Tětková
Section for Cognitive Systems, DTU Compute, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
Oliver Eberle
TU Berlin
Explainable AI, Interpretability, Deep Learning, Machine Learning, NLP