🤖 AI Summary
This study investigates how large language models (LLMs) adjust their cooperative strategies in the iterated Prisoner's Dilemma in response to payoff incentives and linguistic context. By designing a multilingual game-theoretic experiment with scaled payoffs and using a supervised classifier to map LLM behavior onto canonical repeated-game strategies, the work systematically analyzes strategic patterns across models and languages. It finds that the framing effect of language can rival, and sometimes exceed, architectural differences in shaping LLMs' propensity to cooperate. The authors propose a unified auditing framework grounded in classical game theory and show that LLMs consistently exhibit incentive-sensitive conditional cooperation, with language significantly modulating their strategic choices, providing an empirical foundation for safe multi-agent systems and AI governance.
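To make the payoff-scaling manipulation concrete, here is a minimal sketch (not the authors' code) of how a standard Prisoner's Dilemma matrix can be multiplied by a scale factor so that the game's ordinal structure is preserved while only incentive strength varies; the specific payoff values and the `scale_payoffs` helper are illustrative assumptions.

```python
# Hypothetical sketch: scaling a Prisoner's Dilemma payoff matrix so that the
# ordinal structure (T > R > P > S) is preserved while the absolute stakes grow.
# Payoff values and helper names are illustrative, not the authors' implementation.

BASE_PAYOFFS = {
    # (my move, opponent move) -> (my payoff, opponent payoff)
    ("C", "C"): (3, 3),   # mutual cooperation (R)
    ("C", "D"): (0, 5),   # sucker's payoff vs. temptation (S, T)
    ("D", "C"): (5, 0),   # temptation vs. sucker's payoff (T, S)
    ("D", "D"): (1, 1),   # mutual defection (P)
}

def scale_payoffs(payoffs, k):
    """Multiply every payoff by k; rankings are unchanged, incentive strength is not."""
    return {moves: (a * k, b * k) for moves, (a, b) in payoffs.items()}

if __name__ == "__main__":
    for k in (1, 10, 100):  # illustrative incentive-strength conditions
        scaled = scale_payoffs(BASE_PAYOFFS, k)
        print(f"scale x{k}: mutual cooperation pays {scaled[('C', 'C')][0]}, "
              f"unilateral defection pays {scaled[('D', 'C')][0]}")
```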
📝 Abstract
As LLMs increasingly act as autonomous agents in interactive and multi-agent settings, understanding their strategic behavior is critical for safety, coordination, and AI-driven social and economic systems. We investigate how payoff magnitude and linguistic context shape LLM strategies in repeated social dilemmas, using a payoff-scaled Prisoner's Dilemma to isolate sensitivity to incentive strength. Across models and languages, we observe consistent behavioral patterns, including incentive-sensitive conditional strategies and cross-linguistic divergence. To interpret these dynamics, we train supervised classifiers on canonical repeated-game strategies and apply them to LLM decisions, revealing systematic, model- and language-dependent behavioral intentions, with linguistic framing sometimes matching or exceeding architectural effects. Our results provide a unified framework for auditing LLMs as strategic agents and highlight cooperation biases with direct implications for AI governance and multi-agent system design.
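As a rough illustration of the classification step described above, the sketch below trains a supervised classifier on synthetic trajectories generated by canonical repeated-game strategies (always-cooperate, always-defect, tit-for-tat, grim trigger) and then labels an observed move sequence. The strategy set, feature encoding, and scikit-learn model choice are assumptions made for illustration, not the paper's actual pipeline.

```python
# Hypothetical sketch: labelling a sequence of round-by-round decisions with the
# nearest canonical repeated-game strategy. Strategy set, features, and classifier
# choice are illustrative assumptions, not the paper's actual pipeline.
import random
from sklearn.linear_model import LogisticRegression

ROUNDS = 10

def play(strategy, opponent_moves):
    """Return this strategy's moves (1 = cooperate, 0 = defect) given the opponent's history."""
    moves, betrayed = [], False
    for t in range(len(opponent_moves)):
        if strategy == "always_cooperate":
            moves.append(1)
        elif strategy == "always_defect":
            moves.append(0)
        elif strategy == "tit_for_tat":
            moves.append(1 if t == 0 else opponent_moves[t - 1])
        elif strategy == "grim_trigger":
            betrayed = betrayed or (t > 0 and opponent_moves[t - 1] == 0)
            moves.append(0 if betrayed else 1)
    return moves

def features(own, opp):
    """Simple features: own cooperation rate, retaliation rate after defection, first move."""
    retaliations = sum(1 for t in range(1, ROUNDS) if opp[t - 1] == 0 and own[t] == 0)
    provocations = max(1, sum(1 for t in range(1, ROUNDS) if opp[t - 1] == 0))
    return [sum(own) / ROUNDS, retaliations / provocations, own[0]]

random.seed(0)
X, y = [], []
for label in ("always_cooperate", "always_defect", "tit_for_tat", "grim_trigger"):
    for _ in range(200):  # synthetic training games against a random opponent
        opp = [random.randint(0, 1) for _ in range(ROUNDS)]
        X.append(features(play(label, opp), opp))
        y.append(label)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Classify one observed trajectory (e.g. an LLM's moves in a logged game).
opp_obs = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
llm_obs = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]   # cooperates, defects only after being defected on
print(clf.predict([features(llm_obs, opp_obs)]))
```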