🤖 AI Summary
Traditional MMO game optimization relies either on costly online A/B testing or on low-fidelity offline simulation that fails to accurately model player reasoning and causal responses to design interventions.
Method: We propose the first LLM-based generative multi-agent simulation system. Player agents are adapted to the game domain via supervised fine-tuning and reinforcement learning on real player behavioral data, while a data-driven environment dynamics model is trained in parallel on real gameplay logs.
Contribution/Results: Our approach achieves causally consistent, interpretable, and high-fidelity player behavior modeling, overcoming the long-standing fidelity bottleneck of conventional simulation. Experiments demonstrate strong behavioral alignment with real players (average similarity >0.89) and principled, causally grounded responses to game-mechanic interventions. This work establishes an efficient, low-cost, and trustworthy offline optimization framework for MMO numerical balancing and mechanic design.
📝 Abstract
Optimizing numerical systems and mechanism design is crucial for enhancing player experience in Massively Multiplayer Online (MMO) games. Traditional optimization approaches rely on large-scale online experiments or parameter tuning over predefined statistical models, which are costly, time-consuming, and may disrupt player experience. Although simplified offline simulation systems are often adopted as alternatives, their limited fidelity prevents agents from accurately mimicking real player reasoning and reactions to interventions. To address these limitations, we propose a generative agent-based MMO simulation system empowered by Large Language Models (LLMs). By applying Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) on large-scale real player behavioral data, we adapt LLMs from general priors to game-specific domains, enabling realistic and interpretable player decision-making. In parallel, a data-driven environment model trained on real gameplay logs reconstructs dynamic in-game systems. Experiments demonstrate strong consistency with real-world player behaviors and plausible causal responses under interventions, providing a reliable, interpretable, and cost-efficient framework for data-driven numerical design optimization.
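To make the offline intervention workflow concrete, the following is a minimal toy sketch of how an agent-plus-environment simulation can compare a baseline design against an intervention. It is not the authors' implementation: `stub_player_agent` is a hand-written rule standing in for the fine-tuned LLM player agent, `stub_environment` stands in for the learned environment model, and the item price, gold amounts, and purchase rule are all hypothetical values chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class GameState:
    gold: int        # hypothetical player currency
    item_price: int  # hypothetical design parameter under intervention


def stub_player_agent(state: GameState, rng: random.Random) -> str:
    """Stand-in for the LLM player agent (assumption: a simple stochastic rule)."""
    if state.gold < state.item_price:
        return "farm"  # cannot afford the item, so earn gold instead
    # Purchase probability shrinks as the price grows relative to current gold.
    p_buy = 1.0 - state.item_price / (state.gold + state.item_price)
    return "buy" if rng.random() < p_buy else "farm"


def stub_environment(state: GameState, action: str) -> GameState:
    """Stand-in for the learned environment model (assumption: fixed rules)."""
    if action == "buy":
        return GameState(state.gold - state.item_price, state.item_price)
    return GameState(state.gold + 10, state.item_price)  # farming yields 10 gold


def simulate_purchase_rate(item_price: int, n_players: int = 200,
                           n_steps: int = 20, seed: int = 0) -> float:
    """Roll out all simulated players and return the fraction of 'buy' actions."""
    rng = random.Random(seed)
    buys = 0
    for _ in range(n_players):
        state = GameState(gold=50, item_price=item_price)
        for _ in range(n_steps):
            action = stub_player_agent(state, rng)
            buys += action == "buy"
            state = stub_environment(state, action)
    return buys / (n_players * n_steps)


# Compare the baseline design with an intervention that doubles the price.
baseline = simulate_purchase_rate(item_price=30)
intervened = simulate_purchase_rate(item_price=60)
print(f"baseline purchase rate:   {baseline:.3f}")
print(f"intervened purchase rate: {intervened:.3f}")
```

In the system described above, the two stubs would be replaced by the fine-tuned LLM agent and the data-driven environment model. The surrounding loop, which runs the same population under two parameter settings and compares an aggregate behavioral statistic, is the part this sketch is meant to illustrate.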