DVM: Towards Controllable LLM Agents in Social Deduction Games

📅 2025-01-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language model (LLM)-driven non-player characters (NPCs) in social deduction games (e.g., Werewolf) suffer from inflexible intelligence control, rigid difficulty scaling, and insufficient safety and controllability. Method: We propose a win-rate-controllable LLM agent framework featuring a novel tripartite architecture—Predictor, Decider, and Discussor—integrated with role-aware multi-stage prompting and win-rate-constrained Proximal Policy Optimization (PPO) reinforcement learning. Crucially, target win rate is explicitly embedded as a constraint in the decision-chain reward function. Contribution/Results: This enables fine-grained, continuous adaptation of agent capability—from novice to expert level—while ensuring fairness, safety, and interpretability. Experiments on Werewolf demonstrate significant improvements over baselines, with win-rate control error ≤ ±2%, validating precise, reliable, and controllable agent behavior.
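The summary states that the target win rate is embedded as a constraint in the decision-chain reward. As a minimal sketch of that idea (the function name, penalty form, and the coefficient `lam` are assumptions, not the paper's actual formulation), one can penalize the deviation of the agent's observed win rate from the target:

```python
def constrained_reward(base_reward, observed_win_rate, target_win_rate, lam=1.0):
    """Hypothetical win-rate-constrained reward: the usual game reward minus a
    penalty proportional to how far the agent's running win rate drifts from
    the specified target. Driving this penalty to zero keeps the agent's
    proficiency pinned near the target win rate rather than maximized."""
    return base_reward - lam * abs(observed_win_rate - target_win_rate)
```

With a penalty like this, an RL algorithm such as PPO no longer optimizes for winning outright; overshooting the target win rate is punished just like undershooting it, which is what makes the proficiency level tunable.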

📝 Abstract
Large Language Models (LLMs) have advanced the capability of game agents in social deduction games (SDGs). These games rely heavily on conversation-driven interactions and require agents to infer, make decisions, and communicate based on such information. While this progress leads to more sophisticated and strategic non-player characters (NPCs) in SDGs, there exists a need to control the proficiency of these agents. This control not only ensures that NPCs can adapt to varying difficulty levels during gameplay, but also provides insights into the safety and fairness of LLM agents. In this paper, we present DVM, a novel framework for developing controllable LLM agents for SDGs, and demonstrate its implementation on one of the most popular SDGs, Werewolf. DVM comprises three main components: Predictor, Decider, and Discussor. By integrating reinforcement learning with a win-rate-constrained decision chain reward mechanism, we enable agents to dynamically adjust their gameplay proficiency to achieve specified win rates. Experiments show that DVM not only outperforms existing methods in the Werewolf game, but also successfully modulates its performance levels to meet predefined win rate targets. These results pave the way for LLM agents' adaptive and balanced gameplay in SDGs, opening new avenues for research in controllable game agents.
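The abstract names a tripartite pipeline of Predictor, Decider, and Discussor. A toy sketch of how such a chain might fit together (the function bodies here are stand-in heuristics, not the paper's LLM-based components; in DVM each stage would be driven by role-aware prompting):

```python
def predictor(history):
    """Toy Predictor: infer suspicion scores from conversation history,
    here simply by counting how often each player is accused."""
    scores = {}
    for speaker, target in history:
        scores[target] = scores.get(target, 0) + 1
    return scores

def decider(scores):
    """Toy Decider: choose the most-suspected player as the vote target."""
    return max(scores, key=scores.get)

def discussor(target):
    """Toy Discussor: produce a statement justifying the decision."""
    return f"I vote for {target}; their statements were the most inconsistent."

# Accusation log as (speaker, accused) pairs, chained through the three stages.
history = [("A", "C"), ("B", "C"), ("D", "A")]
print(discussor(decider(predictor(history))))  # votes for the most-accused player, C
```

The point of the separation is that inference, decision, and expression are distinct stages, so a win-rate constraint can be applied to the decision chain without touching how the agent talks.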
Problem

Research questions and friction points this paper is trying to address.

Social Inference Games
Large Language Models
Adaptive Intelligence
Innovation

Methods, ideas, or system contributions that make the work stand out.

DVM
Social Reasoning Games
Adaptive Performance Control
Zheng Zhang
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Yihuai Lan
Research Engineer@SMU
Yangsen Chen
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Lei Wang
Singapore Management University, Singapore
Xiang Wang
University of Science and Technology of China, Hefei, China
Hao Wang
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China