Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf

📅 2024-05-30
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
To address the limited capability of large language models (LLMs) in modeling strategic dialogue in complex social deduction games, exemplified by *One Night Ultimate Werewolf* (ONUW), this paper proposes an RL-instructed language agent framework. The method formalizes strategic discussion as a learnable tactic selection module and establishes the existence of Perfect Bayesian Equilibria (PBEs) in two ONUW scenarios, one with discussion and one without. It integrates proximal policy optimization (PPO), LLM-based agents, multi-round belief updating, and game-theoretic analysis. Experiments demonstrate significant improvements across diverse ONUW configurations: +18.3% win rate and +22.7% role inference accuracy, alongside strong cross-scenario generalization. Key contributions include: (i) the first learnable mapping between strategic communication and belief revision; and (ii) an interpretable, differentiable discussion strategy module, establishing a novel paradigm for open-domain strategic dialogue modeling.
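The belief updating mentioned in the summary can be illustrated with a single application of Bayes' rule. The sketch below is not the paper's implementation: the function name `update_belief`, the role set, and all probabilities are hypothetical values chosen only to show how one statement heard during discussion could shift a belief over another player's role.

```python
def update_belief(prior, likelihood):
    """Bayes' rule: posterior(role) is proportional to prior(role) * P(statement | role)."""
    unnormalized = {role: prior[role] * likelihood.get(role, 0.0) for role in prior}
    total = sum(unnormalized.values())
    if total == 0:
        # The statement is impossible under every role; keep the prior unchanged.
        return dict(prior)
    return {role: p / total for role, p in unnormalized.items()}


# Hypothetical prior belief about which role Player 2 holds.
prior = {"Werewolf": 0.4, "Villager": 0.4, "Seer": 0.2}

# Hypothetical likelihood of Player 2 saying "I am the Seer" under each role.
likelihood = {"Werewolf": 0.3, "Villager": 0.1, "Seer": 0.9}

posterior = update_belief(prior, likelihood)
most_likely_role = max(posterior, key=posterior.get)
```

With these made-up numbers the claim makes "Seer" the most likely role, while "Werewolf" remains plausible because a werewolf may also bluff. Repeating the update after each speech gives a multi-round belief trajectory.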

📝 Abstract
Communication is a fundamental aspect of human society, facilitating the exchange of information and beliefs among people. Despite the advancements in large language models (LLMs), recent agents built on them often neglect control over discussion tactics, which are essential in communication scenarios and games. As a variant of the famous communication game Werewolf, One Night Ultimate Werewolf (ONUW) requires players to develop strategic discussion policies due to the potential role changes that increase the uncertainty and complexity of the game. In this work, we first present the existence of the Perfect Bayesian Equilibria (PBEs) in two scenarios of the ONUW game: one with discussion and one without. The results showcase that the discussion greatly changes players' utilities by affecting their beliefs, emphasizing the significance of discussion tactics. Based on the insights obtained from the analyses, we propose an RL-instructed language agent framework, where a discussion policy trained by reinforcement learning (RL) is employed to determine appropriate discussion tactics to adopt. Our experimental results on several ONUW game settings demonstrate the effectiveness and generalizability of our proposed framework. The project page of our paper: https://one-night-ultimate-werewolf.github.io
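The framework described in the abstract, where a learned discussion policy picks a tactic and the language model then speaks according to it, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the tactic names, the linear scoring policy, the feature vector, and the `stub_llm` function are all hypothetical stand-ins (a real system would use a trained policy network and an actual LLM call).

```python
import math
import random

# Hypothetical tactic labels, used only for illustration.
TACTICS = [
    "honest_evidence", "deceptive_evidence",
    "honest_accusation", "deceptive_accusation",
    "honest_defense", "deceptive_defense",
]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class DiscussionPolicy:
    """Stand-in for an RL-trained policy: linear scores over state features."""

    def __init__(self, weights):
        self.weights = weights  # one weight vector per tactic

    def tactic_probs(self, features):
        scores = [sum(w * x for w, x in zip(ws, features)) for ws in self.weights]
        return softmax(scores)

    def select(self, features, rng):
        # Sample a tactic from the policy's distribution.
        return rng.choices(TACTICS, weights=self.tactic_probs(features), k=1)[0]


def build_prompt(observation, tactic):
    """Condition the language model on the chosen tactic."""
    return f"Game state: {observation}\nSpeak using the tactic: {tactic}."


def stub_llm(prompt):
    # Placeholder for a real LLM call; returns a dummy utterance.
    return f"[utterance generated from a {len(prompt)}-character prompt]"


rng = random.Random(0)
weights = [[0.0, 0.0] for _ in TACTICS]
weights[0] = [1.0, 0.0]  # hypothetical: first feature favors honest_evidence
policy = DiscussionPolicy(weights)

features = [1.0, 0.0]  # hypothetical encoding of the player's current belief
probs = policy.tactic_probs(features)
tactic = policy.select(features, rng)
prompt = build_prompt("Player 2 claims to be the Seer", tactic)
utterance = stub_llm(prompt)
```

Separating tactic selection from text generation keeps the RL action space small (a handful of discrete tactics) while leaving the wording of each speech to the language model.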
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Complex Game Environments
Strategic Communication
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning
Strategic Communication
Social Deduction Games