Theory Trace Card: Theory-Driven Socio-Cognitive Evaluation of LLMs

📅 2026-01-05

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Evaluations of social cognition in large language models often lack an explicit theoretical foundation, so high scores may not reflect genuine capability and can produce a systematic validity illusion. To address this, the work proposes the Theory Trace Card (TTC), a lightweight, structured documentation artifact that accompanies an evaluation and makes its theory explicit. A TTC records the theoretical basis of the evaluation, the components of the target capability it exercises, how the task operationalizes them, and the limits of applicability, which together trace the validity chain from theory through task and scoring. The authors argue that this improves the interpretability and reuse of socio-cognitive evaluations and curbs overgeneralization of benchmark results, without modifying the benchmarks themselves.

📝 Abstract

Socio-cognitive benchmarks for large language models (LLMs) often fail to predict real-world behavior, even when models achieve high benchmark scores. Prior work has attributed this evaluation-deployment gap to problems of measurement and validity. While these critiques are insightful, we argue that they overlook a more fundamental issue: many socio-cognitive evaluations proceed without an explicit theoretical specification of the target capability, leaving the assumptions linking task performance to competence implicit. Without this theoretical grounding, benchmarks that exercise only narrow subsets of a capability are routinely misinterpreted as evidence of broad competence: a gap that creates a systemic validity illusion by masking the failure to evaluate the capability's other essential dimensions. To address this gap, we make two contributions. First, we diagnose and formalize this theory gap as a foundational failure that undermines measurement and enables systematic overgeneralization of benchmark results. Second, we introduce the Theory Trace Card (TTC), a lightweight documentation artifact designed to accompany socio-cognitive evaluations, which explicitly outlines the theoretical basis of an evaluation, the components of the target capability it exercises, its operationalization, and its limitations. We argue that TTCs enhance the interpretability and reuse of socio-cognitive evaluations by making explicit the full validity chain, which links theory, task operationalization, scoring, and limitations, without modifying benchmarks or requiring agreement on a single theory.
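The four elements the abstract attributes to a TTC (theoretical basis, components exercised, operationalization, limitations) can be pictured as a small record. The sketch below is illustrative only: the class name, field names, the extra `theory_components` field, and both helper methods are assumptions made for this example, not a schema taken from the paper, and the sample values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TheoryTraceCard:
    """Hypothetical record for the four elements the abstract lists."""

    theoretical_basis: str            # theory the evaluation relies on
    components_exercised: List[str]   # parts of the capability the task touches
    operationalization: str           # how the task and scoring realize them
    limitations: List[str] = field(default_factory=list)
    # Assumption: the full component list of the chosen theory,
    # stored so that untested components can be reported.
    theory_components: List[str] = field(default_factory=list)

    def unexercised_components(self) -> List[str]:
        """Components the theory names but the benchmark does not test."""
        exercised = set(self.components_exercised)
        return [c for c in self.theory_components if c not in exercised]

    def render(self) -> str:
        """Plain-text card suitable for shipping alongside a benchmark."""
        lines = [
            f"Theoretical basis: {self.theoretical_basis}",
            "Components exercised: " + ", ".join(self.components_exercised),
            f"Operationalization: {self.operationalization}",
            "Limitations: " + "; ".join(self.limitations),
        ]
        missing = self.unexercised_components()
        if missing:
            lines.append("Not exercised: " + ", ".join(missing))
        return "\n".join(lines)


# Placeholder values only; they do not describe any real benchmark.
card = TheoryTraceCard(
    theoretical_basis="Example theory X",
    components_exercised=["component A", "component B"],
    operationalization="Multiple-choice vignettes scored by exact match",
    limitations=["Text-only stimuli"],
    theory_components=["component A", "component B", "component C"],
)
print(card.render())
```

Listing the components that are not exercised is one way to make the scope of a score visible: a reader sees which parts of the capability remain untested before generalizing from the result.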

Problem

Research questions and friction points this paper is trying to address.

socio-cognitive evaluation

large language models

validity illusion

theoretical grounding

benchmark overgeneralization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Theory Trace Card

socio-cognitive evaluation

validity illusion