Communicate-Predict-Act: Evaluating Social Intelligence of Agents

2026-04-09
Citations: 0
Influential: 0
AI Summary
This study addresses the lack of an operational definition and systematic evaluation framework for social intelligence in both humans and artificial agents. The authors propose COMPACT (Communicate-Predict-Act), an interaction protocol for a multiplayer arena of mixed cooperative and competitive games, combined with fine-grained probes of social dynamics, to build a testable, multidimensional assessment of social intelligence. From gameplay traces they extract sociocognitive metrics that quantify action prediction, communicative influence, strategic reasoning, and tradeoffs under conflicting interests. Experiments with eight large language models (24B to 1T parameters) show that Elo-style ratings separate the models consistently but give only a partial picture, whereas the sociocognitive metrics exhibit strong intra-model consistency and predict pairwise agent advantage in game outcomes (AUC-ROC = 0.82). Feature importance analysis indicates that influence, transparency, and adaptability are more predictive of success than Theory of Mind inference or deep planning.
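As a reading aid for the AUC-ROC figure quoted above, the following is a minimal sketch of how such a score can be computed for pairwise outcome prediction. It uses the standard rank-based (Mann-Whitney) definition; the function name `auc_roc` and the toy inputs are hypothetical and are not taken from the paper, whose actual features and classifier are not described on this page.

```python
from typing import Sequence


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a randomly chosen positive example is scored
    higher than a randomly chosen negative one (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: each score is a predicted advantage of agent A
# over agent B; each label is 1 if A actually outperformed B, else 0.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = auc_roc(toy_scores, toy_labels)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of pairs.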

πŸ“ Abstract
As large language model (LLM) agents become more prevalent in real-world social settings, social intelligence will play an increasingly critical role. But social intelligence is still a poorly defined construct, for humans and artificial agents alike. We introduce a multiplayer arena of mixed cooperative and competitive social games to study LLM social intelligence. The controllability of LLM-based agents enables systematic evaluation, which also supports broader inferences about social intelligence per se. We evaluated eight diverse LLMs (24B to 1T parameters) using a Communicate-Predict-Act (COMPACT) interaction protocol and fine-grained probing of social dynamics. Elo-style ratings reveal consistent performance differences across models, but this scalar measure provides only a partial characterization of social intelligence. To address this limitation, we analyze gameplay traces to extract sociocognitive metrics capturing action prediction, communicative influence, strategic reasoning, and tradeoffs under conflicting interests. These sociocognitive metrics exhibit strong intra-model consistency, and they reliably predict pairwise agent advantage in game outcomes (AUC-ROC = 0.82). Feature importance analysis indicates that, surprisingly, influence, transparency, and adaptability are more predictive of success than Theory of Mind inference or deep planning. Together, our results advance a testable, multidimensional conception of social intelligence and provide empirical insights into the capacities that underpin it.
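For readers unfamiliar with Elo-style ratings, here is a minimal sketch of the classic two-player Elo update. This is an assumption for illustration only: the abstract does not specify how ratings are computed for its multiplayer games, and the function names, the K-factor of 32, and the 400-point scale are the conventional chess defaults rather than values from the paper.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    rating_a: float, rating_b: float, score_a: float, k: float = 32.0
) -> Tuple[float, float]:
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    The update is zero-sum: whatever A gains, B loses.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated agents; A wins, so A gains k/2 = 16 points.
new_a, new_b = update_elo(1500.0, 1500.0, 1.0)
```

Because the rating collapses each agent's record into a single number, it cannot say why one agent outperforms another, which is the limitation the sociocognitive metrics are meant to address.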
Problem

Research questions and friction points this paper is trying to address.

social intelligence
large language models
multiagent systems
social cognition
evaluation framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

social intelligence
large language model agents
COMPACT protocol
sociocognitive metrics
multi-agent evaluation
David Shoresh
Edmond and Lily Safra Center for Brain Sciences; The Federmann Center for the Study of Rationality; The Alexander Silberman Institute of Life Science and the Department of Cognitive and Brain Sciences, The Hebrew University of Jerusalem
Sarit Kraus
Professor Of Computer Science, Bar-Ilan University
Artificial Intelligence, Human agent interaction, Multi-agent Systems, multiagent systems
Yonatan Loewenstein
The Hebrew University of Jerusalem