🤖 AI Summary
This study investigates whether large language models genuinely possess compositional linguistic capabilities, focusing on adjective-noun composition. Combining two perspectives, functional probing tasks and internal representation analysis, it systematically evaluates how well models handle compositional understanding. The findings show that although some models form compositional representations internally, these representations do not consistently translate into success on functional tasks, and results vary markedly across model variants. This discrepancy exposes a gap between internal representational structure and observable functional behavior. The study argues that both evaluation paradigms are needed for a comprehensive assessment of compositional competence in language models, offering a useful lens on their capacity for systematic generalization.
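To make the functional side of this dual setup concrete, the sketch below shows what a prompt-based probe over adjective-noun items might look like. The model choice, the test items, and the yes/no scoring rule are all illustrative assumptions, not the study's actual protocol.

```python
# Minimal sketch of a prompt-based functional probe for adjective-noun
# composition. Items and scoring are illustrative assumptions only.
from transformers import pipeline

# Hypothetical test items: (phrase, property, expected answer).
# Intersective adjectives preserve the noun's category ("a red car is a car"),
# while privative ones like "fake" do not ("a fake gun is not a gun").
ITEMS = [
    ("red car", "a car", "yes"),
    ("fake gun", "a gun", "no"),
    ("small elephant", "small", "no"),
]

generator = pipeline("text-generation", model="gpt2")  # stand-in model

def probe(phrase: str, prop: str) -> str:
    """Ask the model a yes/no question about the composed phrase."""
    prompt = f"Question: Is a {phrase} {prop}? Answer yes or no.\nAnswer:"
    out = generator(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"]
    return out[len(prompt):].strip().lower()

correct = sum(probe(p, q).startswith(a) for p, q, a in ITEMS)
print(f"functional accuracy: {correct}/{len(ITEMS)}")
```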
📝 Abstract
Compositionality is considered central to language abilities. As performant language systems, how well do large language models (LLMs) perform on compositional tasks? We evaluate adjective-noun compositionality in LLMs using two complementary setups: prompt-based functional assessment and a representational analysis of internal model states. Our results reveal a striking divergence between task performance and internal states: while LLMs reliably develop compositional representations, these representations do not consistently translate into functional task success across model variants. Consequently, we highlight the importance of contrastive evaluation for obtaining a more complete understanding of model capabilities.
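As a concrete illustration of the representational side, the following sketch compares a model's hidden state for an adjective-noun phrase against a naive additive composition of its parts. The choice of gpt2, the pooling, the layer, and the cosine criterion are assumptions made for illustration, not the paper's method.

```python
# Minimal sketch of a representational analysis of adjective-noun
# composition: does the phrase embedding resemble adjective + noun?
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

def embed(text: str, layer: int = 6) -> torch.Tensor:
    """Mean-pooled hidden state of `text` at an arbitrary mid-depth layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs, output_hidden_states=True).hidden_states[layer]
    return hidden.mean(dim=1).squeeze(0)

phrase, adj, noun = "red car", "red", "car"
composed = embed(adj) + embed(noun)  # naive additive composition baseline
similarity = torch.nn.functional.cosine_similarity(embed(phrase), composed, dim=0)
print(f"cos(phrase, adj + noun) = {similarity.item():.3f}")
```

A fuller analysis would sweep layers and phrase sets and train a probe rather than rely on a single cosine score, but the sketch captures the basic contrast with the prompt-based probe: here the model is never asked a question, only inspected.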