The Subtle Art of Defection: Understanding Uncooperative Behaviors in LLM based Multi-Agent Systems

📅 2025-11-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work investigates how uncooperative behavior destabilizes or collapses large language model (LLM)-based multi-agent systems. It proposes a game-theoretic taxonomy of uncooperative agent behaviors, which the authors present as filling a notable gap in the literature, and a structured multi-stage simulation pipeline that generates and refines uncooperative behaviors as agents' states evolve. The framework is evaluated in a collaborative resource-management setting, with system stability measured by metrics such as survival time and resource overuse rate. In human evaluation, the framework achieves 96.7% accuracy in generating realistic uncooperative behaviors. Fully cooperative agents maintain 100% survival over 12 rounds with 0% resource overuse, whereas any uncooperative behavior can trigger system collapse within 1 to 7 rounds. The main takeaway is that uncooperative agents can significantly degrade collective outcomes, which motivates the design of more resilient multi-agent systems.

📝 Abstract

This paper introduces a novel framework for simulating and analyzing how uncooperative behaviors can destabilize or collapse LLM-based multi-agent systems. Our framework includes two key components: (1) a game theory-based taxonomy of uncooperative agent behaviors, addressing a notable gap in the existing literature; and (2) a structured, multi-stage simulation pipeline that dynamically generates and refines uncooperative behaviors as agents' states evolve. We evaluate the framework via a collaborative resource management setting, measuring system stability using metrics such as survival time and resource overuse rate. Empirically, our framework achieves 96.7% accuracy in generating realistic uncooperative behaviors, validated by human evaluations. Our results reveal a striking contrast: cooperative agents maintain perfect system stability (100% survival over 12 rounds with 0% resource overuse), while any uncooperative behavior can trigger rapid system collapse within 1 to 7 rounds. These findings demonstrate that uncooperative agents can significantly degrade collective outcomes, highlighting the need for designing more resilient multi-agent systems.
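
The two stability metrics named in the abstract, survival time and resource overuse rate, can be illustrated with a toy shared-resource loop. This is only a minimal sketch and not the authors' framework: the function `run_commons`, the fixed per-agent harvests, and every parameter (initial stock, regrowth, capacity, the rule that a round counts as overuse when total harvest exceeds regrowth) are hypothetical choices made for illustration. Only the 12-round horizon is taken from the abstract.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    survival_time: int    # rounds played before collapse (or max_rounds if none)
    overuse_rate: float   # fraction of played rounds with harvest above regrowth
    collapsed: bool


def run_commons(agent_harvests, stock=100.0, regrowth=20.0,
                capacity=100.0, max_rounds=12):
    """Toy common-pool resource game (hypothetical, for illustration only).

    Each agent takes a fixed amount per round. A round is counted as
    overuse when the total harvest exceeds what the pool regrows, and
    the system collapses when the stock is exhausted.
    """
    overused_rounds = 0
    total = sum(agent_harvests)
    for rnd in range(1, max_rounds + 1):
        if total > regrowth:
            overused_rounds += 1
        stock -= total
        if stock <= 0:
            return RunResult(rnd, overused_rounds / rnd, True)
        # The pool regrows after each round, capped at its capacity.
        stock = min(capacity, stock + regrowth)
    return RunResult(max_rounds, overused_rounds / max_rounds, False)


# Four cooperative agents whose combined harvest matches regrowth.
cooperative = run_commons([5.0, 5.0, 5.0, 5.0])
# Same group, but one agent over-harvests (an assumed form of defection).
with_defector = run_commons([5.0, 5.0, 5.0, 45.0])

print(cooperative)
print(with_defector)
```

Under these assumed numbers the cooperative group lasts the full horizon with no overuse, while the group with one over-harvesting agent exhausts the pool in the second round. The specific collapse round here is an artifact of the toy parameters and says nothing about the paper's reported 1 to 7 round range.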

Problem

Research questions and friction points this paper is trying to address.

Simulating uncooperative behaviors in LLM multi-agent systems

Analyzing how defection destabilizes collaborative resource management

Developing a taxonomy and pipeline for agent behavior generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Game-theoretic taxonomy of uncooperative agent behaviors

Dynamic multi-stage simulation pipeline for behavior generation

Framework achieving 96.7% accuracy in generating realistic uncooperative behaviors, as validated by human evaluation

🔎 Similar Papers

Deception in Reinforced Autonomous Agents