🤖 AI Summary
This work investigates how uncooperative behavior destabilizes or collapses large language model (LLM)-based multi-agent systems. We propose a game-theoretic taxonomy of uncooperative agent behaviors and a structured, multi-stage simulation pipeline that dynamically generates and refines those behaviors as agents' states evolve. The framework is evaluated in a collaborative resource-management setting, with system stability measured by survival time and resource overuse rate. In human evaluation, the framework achieves 96.7% accuracy in generating realistic uncooperative behaviors. Fully cooperative agents keep the system stable (100% survival over 12 rounds with 0% resource overuse), whereas any uncooperative behavior can trigger collapse within 1 to 7 rounds. The core contribution is showing that uncooperative agents can sharply degrade collective outcomes, together with a principled framework for studying uncooperative dynamics in LLM multi-agent systems.
📝 Abstract
This paper introduces a novel framework for simulating and analyzing how uncooperative behaviors can destabilize or collapse LLM-based multi-agent systems. Our framework includes two key components: (1) a game theory-based taxonomy of uncooperative agent behaviors, addressing a notable gap in the existing literature; and (2) a structured, multi-stage simulation pipeline that dynamically generates and refines uncooperative behaviors as agents' states evolve. We evaluate the framework via a collaborative resource management setting, measuring system stability using metrics such as survival time and resource overuse rate. Empirically, our framework achieves 96.7% accuracy in generating realistic uncooperative behaviors, validated by human evaluations. Our results reveal a striking contrast: cooperative agents maintain perfect system stability (100% survival over 12 rounds with 0% resource overuse), while any uncooperative behavior can trigger rapid system collapse within 1 to 7 rounds. These findings demonstrate that uncooperative agents can significantly degrade collective outcomes, highlighting the need for designing more resilient multi-agent systems.
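To make the two stability metrics concrete, the following is a minimal sketch of how survival time and resource overuse rate might be computed in a toy shared-resource game. It is not the paper's implementation: the function `simulate`, the `RunResult` type, the pool size, the regeneration amount, the per-agent quota, and the collapse condition are all assumptions chosen for illustration, and the abstract does not define the metrics at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    survival_time: int   # rounds completed before the pool was exhausted
    overuse_rate: float  # fraction of requests above the per-agent quota


def simulate(requests, pool=100.0, regen=20.0, max_rounds=12):
    """Run a toy shared-resource game (hypothetical setup).

    requests[r][i] is the amount agent i takes from the pool in round r.
    """
    n_agents = len(requests[0])
    quota = regen / n_agents  # assumed sustainable share per agent per round
    overused = total = survived = 0
    for round_requests in requests[:max_rounds]:
        total += len(round_requests)
        overused += sum(1 for amount in round_requests if amount > quota)
        pool -= sum(round_requests)
        if pool <= 0:  # assumed collapse condition: the pool is exhausted
            break
        pool += regen  # the pool regenerates only if the system survived
        survived += 1
    return RunResult(survived, overused / total)


# Four agents that each take exactly their quota in every round.
cooperative = simulate([[5.0] * 4 for _ in range(12)])

# Same setup, but one agent takes far more than its share.
with_defector = simulate([[5.0, 5.0, 5.0, 60.0] for _ in range(12)])

print(cooperative)
print(with_defector)
```

In this toy configuration the cooperative run lasts all 12 rounds with no overuse, while the run with a single over-consuming agent collapses early. These numbers follow only from the assumed parameters and should not be read as reproducing the reported results.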