Evaluating Collective Behaviour of Hundreds of LLM Agents

📅 2026-02-18
🤖 AI Summary
This work addresses the risk that large language model (LLM) agents, when deployed at scale in social dilemmas, prioritise individual incentives over collective welfare and thereby settle into inefficient equilibria. The authors propose an evaluation framework in which LLMs generate strategies encoded as inspectable algorithms, which are then assessed in multi-agent simulations with populations of hundreds of agents. A cultural evolution model emulates how users select agents. Experiments indicate that more recent LLMs tend to produce worse societal outcomes than older ones, and that populations are more likely to converge on poor equilibria as the relative benefit of cooperation shrinks or the population grows. The code is released as an evaluation suite for assessing the emergent collective behaviour of models.

📝 Abstract
As autonomous agents powered by LLMs are increasingly deployed in society, understanding their collective behaviour in social dilemmas becomes critical. We introduce an evaluation framework where LLMs generate strategies encoded as algorithms, enabling inspection prior to deployment and scaling to populations of hundreds of agents (substantially larger than in previous work). We find that more recent models tend to produce worse societal outcomes than older models when agents prioritise individual gain over collective benefit. Using cultural evolution to model user selection of agents, our simulations reveal a significant risk of convergence to poor societal equilibria, particularly when the relative benefit of cooperation diminishes and population sizes increase. We release our code as an evaluation suite for developers to assess the emergent collective behaviour of their models.
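
To make the setup described in the abstract concrete, the following is a minimal sketch and not the authors' released suite. It assumes a linear public goods game as the social dilemma (the abstract does not name a specific game), uses three hand-written strategies as stand-ins for LLM-generated algorithms, and models cultural evolution with a simple "copy a better-performing peer" rule. All function names and parameters (`play_round`, `evolve`, `simulate`, `multiplier`) are hypothetical.

```python
import random

# Stand-ins for LLM-generated strategies: each is a plain, inspectable
# function mapping last round's cooperation fraction to an action.
def always_cooperate(prev_coop_fraction):
    return True

def always_defect(prev_coop_fraction):
    return False

def conditional_cooperator(prev_coop_fraction):
    # Cooperate only if at least half the population cooperated last round.
    return prev_coop_fraction >= 0.5

def play_round(strategies, prev_coop_fraction, multiplier):
    """One round of an assumed linear public goods game.

    Cooperators pay 1 into a pot; the pot is scaled by `multiplier`
    and split equally among all agents.
    """
    actions = [s(prev_coop_fraction) for s in strategies]
    pot_share = multiplier * sum(actions) / len(actions)
    payoffs = [pot_share - (1.0 if a else 0.0) for a in actions]
    return actions, payoffs

def evolve(strategies, payoffs, rng):
    """Assumed imitation rule: each agent samples one random peer and
    adopts that peer's strategy if the peer earned strictly more."""
    n = len(strategies)
    updated = []
    for i in range(n):
        j = rng.randrange(n)
        updated.append(strategies[j] if payoffs[j] > payoffs[i] else strategies[i])
    return updated

def simulate(n_agents=200, multiplier=1.5, generations=30, rounds=5, seed=0):
    """Return the mean cooperation fraction observed in each generation."""
    rng = random.Random(seed)
    pool = [always_cooperate, always_defect, conditional_cooperator]
    strategies = [rng.choice(pool) for _ in range(n_agents)]
    history = []
    for _ in range(generations):
        totals = [0.0] * n_agents
        coop_fraction = 1.0  # optimistic belief before the first round
        coop_sum = 0.0
        for _ in range(rounds):
            actions, payoffs = play_round(strategies, coop_fraction, multiplier)
            coop_fraction = sum(actions) / n_agents
            coop_sum += coop_fraction
            totals = [t + p for t, p in zip(totals, payoffs)]
        history.append(coop_sum / rounds)
        strategies = evolve(strategies, totals, rng)
    return history

if __name__ == "__main__":
    for m in (1.5, 3.0):
        trajectory = simulate(multiplier=m)
        print(f"multiplier={m}: final cooperation={trajectory[-1]:.2f}")
```

In this toy model, lowering `multiplier` or raising `n_agents` reduces each agent's return on its own contribution (`multiplier / n_agents`), which is one way to explore the regime the abstract flags as risky. The outcomes of this sketch say nothing about the paper's reported results.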
Problem

Research questions and friction points this paper is trying to address.

collective behaviour
social dilemmas
LLM agents
societal outcomes
cooperation
Innovation

Methods, ideas, or system contributions that make the work stand out.

collective behaviour
LLM agents
evaluation framework
cultural evolution
social dilemmas