Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games

📅 2025-06-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates how multi-agent large language models (LLMs) sustain cooperation in social dilemmas, focusing on the trade-off between individual incentives and collective welfare when sanctioning is costly. Using a repeated public goods game from behavioral economics, augmented with institutional choice, we conduct multi-round interactive experiments across diverse LLMs. We identify four distinct behavioral patterns and find that models with enhanced reasoning capabilities free-ride more often and sustain cooperation less reliably than baseline models. These findings challenge the prevailing assumption that stronger reasoning inherently fosters cooperation, suggesting that current LLM capability trajectories may inadvertently erode the capacity for social coordination. The work provides empirical evidence and a fresh lens for designing robust, safe, and socially aligned AI agent systems.
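For context, the payoff structure behind this framework, in its standard behavioral-economics form, is the public goods game with peer punishment. The notation below is the textbook version, not necessarily the paper's exact specification:

```latex
% Round payoff for agent i in a group of n (standard form; illustrative).
%   e        per-round endowment
%   c_i      agent i's contribution to the common pot
%   m        public-good multiplier, with 1 < m < n
%   p_{ij}   punishment points i assigns to j
%   \kappa   cost per point paid by the punisher
%   \sigma   deduction per point suffered by the target
\pi_i = e - c_i + \frac{m}{n}\sum_{j=1}^{n} c_j
      - \kappa \sum_{j \neq i} p_{ij}
      - \sigma \sum_{j \neq i} p_{ji}
```

Because 1 < m < n, contributing maximizes group welfare while free-riding maximizes individual payoff, which is precisely the tension the experiments probe; the punishment terms apply only inside a sanctioning institution.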

📝 Abstract
As large language models (LLMs) are increasingly deployed as autonomous agents, understanding the mechanisms behind their cooperation and social behavior is becoming ever more important. In particular, how LLMs balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment. In this paper, we examine the challenge of costly sanctioning in multi-agent LLM systems, where an agent must decide whether to invest its own resources to incentivize cooperation or penalize defection. To study this, we adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas over repeated interactions. Our analysis reveals four distinct behavioral patterns among models: some consistently establish and sustain high levels of cooperation, others fluctuate between engagement and disengagement, some gradually decline in cooperative behavior over time, and others rigidly follow fixed strategies regardless of outcomes. Surprisingly, we find that reasoning LLMs, such as the o1 series, struggle significantly with cooperation, whereas some traditional LLMs consistently achieve high levels of cooperation. These findings suggest that the current approach to improving LLMs, which focuses on enhancing their reasoning capabilities, does not necessarily lead to cooperation, providing valuable insights for deploying LLM agents in environments that require sustained collaboration. Our code is available at https://github.com/davidguzmanp/SanctSim
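To make the setup concrete, the following is a minimal, self-contained Python sketch of one round of a public goods game with institutional choice and costly peer sanctioning. It is not the SanctSim implementation: the class names, parameter values, and the 1:3 punisher-cost-to-target-loss ratio are illustrative assumptions borrowed from common behavioral-economics designs.

```python
from dataclasses import dataclass, field

# Illustrative parameters (not the paper's exact values).
ENDOWMENT = 20       # tokens each agent receives per round
MULTIPLIER = 1.6     # public-good multiplier, split among institution members
PUNISH_COST = 1      # cost the punisher pays per punishment point
PUNISH_IMPACT = 3    # deduction the target suffers per punishment point

@dataclass
class Agent:
    name: str
    institution: str = "SI"   # "SI" (sanctioning) or "SFI" (sanction-free)
    contribution: int = 0
    punishment_sent: dict = field(default_factory=dict)  # target name -> points

def play_round(agents):
    """Compute one round's payoffs for a public goods game with
    institutional choice and costly peer sanctioning."""
    payoffs = {}
    for inst in ("SI", "SFI"):
        members = [a for a in agents if a.institution == inst]
        if not members:
            continue
        pot = sum(a.contribution for a in members)
        share = MULTIPLIER * pot / len(members)
        for a in members:
            payoffs[a.name] = ENDOWMENT - a.contribution + share
    # Sanctioning exists only inside the sanctioning institution (SI).
    for a in agents:
        if a.institution != "SI":
            continue
        for target, points in a.punishment_sent.items():
            if target in payoffs:
                payoffs[a.name] -= PUNISH_COST * points    # punisher pays
                payoffs[target] -= PUNISH_IMPACT * points  # target loses more
    return payoffs

agents = [
    Agent("cooperator", "SI", contribution=20),
    Agent("free_rider", "SI", contribution=0),
    Agent("punisher", "SI", contribution=15, punishment_sent={"free_rider": 3}),
]
print(play_round(agents))
# The free-rider still earns the most this round, but sanctioning narrows
# the gap; over repeated rounds this is what can stabilize cooperation.
```

In the institutional-choice designs this paper adapts, agents can also migrate between the sanctioning and sanction-free institutions across rounds based on observed payoffs, which is what allows a sanctioning institution to attract members and stabilize cooperation over time.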
Problem

Research questions and friction points this paper is trying to address.

- How do LLMs balance self-interest and collective well-being?
- How do LLM agents handle costly sanctioning in multi-agent systems?
- How do enhanced reasoning capabilities affect LLM cooperation?
Innovation

Methods, ideas, or system contributions that make the work stand out.

- Adapts a public goods game with institutional choice to analyze LLM agent behavior
- Identifies four distinct cooperation patterns across models (a heuristic sketch follows this list)
- Compares cooperation performance of reasoning vs. non-reasoning LLMs
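As a rough illustration of how such trajectories could be labelled, the heuristic below maps a per-round contribution series onto the four reported patterns. The function, its thresholds, and the example inputs are hypothetical; the paper's own classification criteria are not reproduced here.

```python
from statistics import mean, pstdev

def classify_trajectory(contribs, max_contrib=20):
    """Map a per-round contribution series to one of the four behavioral
    patterns reported in the paper. Thresholds are illustrative guesses,
    not the authors' criteria."""
    norm = [c / max_contrib for c in contribs]
    level, spread = mean(norm), pstdev(norm)
    half = len(norm) // 2
    drift = mean(norm[half:]) - mean(norm[:half])  # crude trend estimate
    if spread < 0.05:   # near-constant play, regardless of outcomes
        return "sustained cooperation" if level > 0.7 else "rigid fixed strategy"
    if drift < -0.2:    # contributions clearly falling over time
        return "gradual decline"
    return "fluctuating engagement"

print(classify_trajectory([20] * 10))                      # sustained cooperation
print(classify_trajectory([20, 18, 14, 10, 6, 3, 1, 0]))   # gradual decline
print(classify_trajectory([20, 2, 18, 1, 19, 3, 17, 2]))   # fluctuating engagement
print(classify_trajectory([5] * 10))                       # rigid fixed strategy
```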