🤖 AI Summary
This study addresses the unstable, hard-to-predict strategies that large language models (LLMs) exhibit in multi-agent strategic interactions, a problem that often stems from context dependence. It introduces costless pre-play communication, commonly known as "cheap talk," into a ten-round repeated Prisoner's Dilemma played by models with 7 to 9 billion parameters and evaluates its effect on strategic stability. Cooperation trajectories are fitted with LOWESS regression and compared using simulation-level bootstrap resampling and nonparametric inference. Cheap talk reduces trajectory noise in a majority of model-context pairings, making behavior more predictable. The effect persists across prompt variants and decoding regimes and is largest for models with high baseline volatility. Only a few context-specific configurations show reduced stability.
📝 Abstract
Large Language Models (LLMs) often exhibit pronounced context-dependent variability that undermines predictable multi-agent behavior in tasks requiring strategic thinking. Focusing on models with 7 to 9 billion parameters playing a ten-round repeated Prisoner's Dilemma, we evaluate whether short, costless pre-play messages emulating the cheap-talk paradigm affect strategic stability. Our analysis uses simulation-level bootstrap resampling and nonparametric inference to compare cooperation trajectories, fitted with LOWESS regression, between the messaging and no-messaging conditions. We demonstrate consistent reductions in trajectory noise across a majority of the model-context pairings studied. The stabilizing effect persists across multiple prompt variants and decoding regimes, though its magnitude depends on model choice and contextual framing, with models displaying higher baseline volatility gaining the most. Communication rarely harms stability; we document the few context-specific exceptions and identify the limited domains in which they occur. These findings position cheap-talk style communication as a low-cost, practical tool for improving the predictability and reliability of strategic behavior in multi-agent LLM systems.
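The analysis pipeline named in the abstract (LOWESS-fitted cooperation trajectories compared by simulation-level bootstrap) can be illustrated with a minimal sketch. The abstract does not specify the noise metric, the LOWESS settings, or the data layout, so everything below is an assumption made for illustration: noise is taken to be the RMS residual of the per-round cooperation rate around its smoothed fit, the smoother is a simplified tricube-weighted local linear fit without robustness iterations, and the helper names (`lowess_fit`, `trajectory_noise`, `bootstrap_noise_diff`, `toy_sims`) and the toy data are hypothetical.

```python
import random


def lowess_fit(xs, ys, frac=0.6):
    """Simplified LOWESS: tricube-weighted local linear fit (no robustness passes)."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours defining the bandwidth
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def trajectory_noise(sims, frac=0.6):
    """Assumed noise metric: RMS residual of the cooperation rate around its fit.

    `sims` is a list of simulations, each a list of 0/1 moves (1 = cooperate).
    """
    rounds = len(sims[0])
    rate = [sum(s[r] for s in sims) / len(sims) for r in range(rounds)]
    xs = list(range(1, rounds + 1))
    fit = lowess_fit(xs, rate, frac)
    return (sum((a - b) ** 2 for a, b in zip(rate, fit)) / rounds) ** 0.5


def bootstrap_noise_diff(talk, no_talk, n_boot=500, seed=0):
    """95% percentile interval for noise(talk) - noise(no_talk).

    Whole simulations are resampled with replacement within each condition.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        t = [rng.choice(talk) for _ in talk]
        n = [rng.choice(no_talk) for _ in no_talk]
        diffs.append(trajectory_noise(t) - trajectory_noise(n))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


def toy_sims(rng, n_sims, p_by_round):
    """Hypothetical toy data: independent moves with per-round cooperation odds."""
    return [[1 if rng.random() < p else 0 for p in p_by_round] for _ in range(n_sims)]


if __name__ == "__main__":
    rng = random.Random(42)
    # Made-up probabilities, not results from the paper.
    talk = toy_sims(rng, 30, [0.8] * 10)
    no_talk = toy_sims(rng, 30, [0.9, 0.3, 0.8, 0.2, 0.7, 0.4, 0.9, 0.2, 0.6, 0.3])
    lo, hi = bootstrap_noise_diff(talk, no_talk)
    print(f"95% interval for noise difference (talk - no talk): [{lo:.3f}, {hi:.3f}]")
```

Under these assumptions, an interval lying entirely below zero would indicate lower trajectory noise in the messaging condition. A production analysis would more likely rely on an established LOWESS implementation and on the paper's own noise definition.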