Playing repeated games with Large Language Models

📅 2023-05-26
🏛️ arXiv.org
📈 Citations: 116
Influential: 6
🤖 AI Summary
This study investigates the differential capacity of large language models (LLMs) to cooperate and coordinate in repeated games, distinguishing between self-interested settings (e.g., repeated Prisoner’s Dilemma) and coordination-intensive settings (e.g., Battle of the Sexes)—both canonical 2×2 games. Method: Using multi-model adversarial simulations, human-subject controlled experiments, and behavioral game-theoretic analysis, we identify a critical limitation: while LLMs exhibit robust performance in self-interested games, they struggle markedly in coordination games requiring common knowledge and belief alignment. To address this, we propose Social Chain-of-Thought (SCoT), a method that explicitly models opponent intentions and jointly reasons about coordinated strategies. Contribution/Results: Empirical evaluation shows SCoT consistently improves collaboration success rates between LLMs (e.g., GPT-4) and humans across diverse coordination tasks, demonstrating strong generalization. This work provides the first systematic characterization of LLMs’ social decision-making boundaries and introduces an interpretable, intervention-aware paradigm for modeling machine social behavior.
📝 Abstract
LLMs are increasingly used in applications where they interact with humans and other agents. We propose to use behavioural game theory to study LLMs' cooperation and coordination behaviour. We let different LLMs play finitely repeated $2 \times 2$ games with each other, with human-like strategies, and actual human players. Our results show that LLMs perform particularly well at self-interested games like the iterated Prisoner's Dilemma family. However, they behave sub-optimally in games that require coordination, like the Battle of the Sexes. We verify that these behavioural signatures are stable across robustness checks. We additionally show how GPT-4's behaviour can be modulated by providing additional information about its opponent and by using a "social chain-of-thought" (SCoT) strategy. This also leads to better scores and more successful coordination when interacting with human players. These results enrich our understanding of LLMs' social behaviour and pave the way for a behavioural game theory for machines.
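To make the setup concrete, the finitely repeated 2×2 games in the abstract can be sketched as payoff tables plus a play loop. This is a minimal illustration with textbook payoff values and two human-like baseline strategies of the kind the paper pits against LLMs; the numbers and strategy set are assumptions, not the study's exact configuration.

```python
# Finitely repeated 2x2 games: a minimal sketch.
# Payoff values are the standard textbook ones, not necessarily the paper's.

# (row_action, col_action) -> (row_payoff, col_payoff)
PRISONERS_DILEMMA = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}
BATTLE_OF_THE_SEXES = {
    ("F", "F"): (2, 1), ("F", "O"): (0, 0),
    ("O", "F"): (0, 0), ("O", "O"): (1, 2),
}

def play_repeated(game, strat_row, strat_col, rounds=10):
    """Play a finitely repeated 2x2 game; each strategy sees the opponent's history."""
    hist_row, hist_col = [], []
    score_row = score_col = 0
    for _ in range(rounds):
        a = strat_row(hist_col)  # row player conditions on column's past moves
        b = strat_col(hist_row)
        pa, pb = game[(a, b)]
        score_row += pa
        score_col += pb
        hist_row.append(a)
        hist_col.append(b)
    return score_row, score_col

# Two human-like baselines: tit-for-tat cooperates first, then mirrors;
# always-defect never cooperates.
tit_for_tat = lambda opp: "C" if not opp else opp[-1]
always_defect = lambda opp: "D"

print(play_repeated(PRISONERS_DILEMMA, tit_for_tat, always_defect, rounds=10))
# -> (9, 14): one exploited round, then mutual defection
```

In this frame, the paper's finding is that an LLM does well when a game like the Prisoner's Dilemma rewards a self-interested best response, but struggles in the Battle of the Sexes, where both payoff entries are earned only by converging on the same action.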
Problem

Research questions and friction points this paper is trying to address.

Studying LLM cooperation and coordination using game theory
Evaluating LLM performance in self-interested vs. coordination games
Modulating GPT-4 behavior for better human interaction outcomes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using behavioural game theory for LLM interaction analysis
Testing LLMs in repeated 2x2 games with humans
Modulating GPT-4 behaviour with social chain-of-thought
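The SCoT intervention amounts to a prompting change: before committing to an action, the model is first asked to predict its opponent's next move and only then to choose. A hypothetical prompt-construction sketch follows; the wording, function name, and format are illustrative assumptions, not the paper's exact prompt.

```python
# Hypothetical "social chain-of-thought" (SCoT) prompt builder, in the spirit of
# the paper: step 1 predicts the opponent, step 2 conditions the action on that
# prediction. The template text is illustrative, not the authors' exact prompt.

def scot_prompt(game_description: str, history: list[tuple[str, str]]) -> str:
    rounds = "\n".join(
        f"Round {i + 1}: you played {mine}, the other player played {theirs}"
        for i, (mine, theirs) in enumerate(history)
    )
    return (
        f"{game_description}\n\n"
        f"History so far:\n{rounds or '(first round)'}\n\n"
        "Step 1: What do you think the other player will do next, and why?\n"
        "Step 2: Given that prediction, which action should you take?\n"
        "Answer with your reasoning, then a final line 'Action: <choice>'."
    )

prompt = scot_prompt(
    "You are playing Battle of the Sexes. Options: F (football) or O (opera).",
    [("F", "O"), ("O", "O")],
)
print(prompt)
```

The prompt string would then be sent to the model of choice; the design point is simply that the opponent-prediction step is made explicit rather than left implicit in the model's reasoning.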