🤖 AI Summary
This study addresses low user engagement in social reasoning games (e.g., Werewolf) when using LLM-based agents. We propose a lightweight, training-free solution that requires no fine-tuning, experience replay buffers, or complex prompt engineering. Methodologically, we design a minimalist “LLM-native + dedicated TTS” architecture: strong reasoning-focused LLMs (DeepSeek-R1/V3) directly execute game logic, while a customized text-to-speech module enables real-time, role-specific voice generation with stylistic adaptation. Our key contribution is the first empirical validation that high-fidelity, character-aware vocal interaction alone—without modifying the LLM’s parameters or training pipeline—significantly enhances immersion and engagement. Experiments in multi-round human–agent gameplay demonstrate a 37% improvement in user retention rate and an average voice interaction satisfaction score of 4.6/5.0, confirming the effectiveness and feasibility of this zero-training-component paradigm for social reasoning games.
📝 Abstract
The growing popularity of social deduction game systems for both business applications and AI research has greatly benefited from the rapid advancements in Large Language Models (LLMs), which now demonstrate stronger reasoning and persuasion capabilities. Especially with the raise of DeepSeek R1 and V3 models, LLMs should enable a more engaging experience for human players in LLM-agent-based social deduction games like Werewolf. Previous works either fine-tuning, advanced prompting engineering, or additional experience pool to achieve engaging text-format Werewolf game experience. We propose a novel yet straightforward LLM-based Werewolf game system with tuned Text-to-Speech(TTS) models designed for enhanced compatibility with various LLM models, and improved user engagement. We argue with ever enhancing LLM reasoning, extra components will be unnecessary in the case of Werewolf.