Agents Play Thousands of 3D Video Games

📅 2025-03-17
🤖 AI Summary
This work addresses three challenges for agents in 3D first-person shooter (FPS) games: poor generalization across games, high training cost, and policies that are hard to interpret. It proposes PORTAL, a framework that reformulates game decision-making as a language generation task. PORTAL uses large language models (LLMs) to synthesize executable policies written as behavior trees in a domain-specific language (DSL); according to the authors, these policies can be deployed immediately and generalize across thousands of distinct FPS games. Its core contributions are a language-guided hybrid policy architecture that integrates rule-based nodes with neural network components, and a dual-feedback mechanism that combines quantitative game metrics with vision-language model (VLM) scene analysis. Together these are intended to provide strategic depth, precise low-level control, and human interpretability. The reported experiments indicate that PORTAL avoids conventional reinforcement learning training and its computational cost while improving development efficiency, cross-game generalization, and behavior diversity.

📝 Abstract
We present PORTAL, a novel framework for developing artificial intelligence agents capable of playing thousands of 3D video games through language-guided policy generation. By transforming decision-making problems into language modeling tasks, our approach leverages large language models (LLMs) to generate behavior trees represented in a domain-specific language (DSL). This method eliminates the computational burden associated with traditional reinforcement learning approaches while preserving strategic depth and rapid adaptability. Our framework introduces a hybrid policy structure that combines rule-based nodes with neural network components, enabling both high-level strategic reasoning and precise low-level control. A dual-feedback mechanism incorporating quantitative game metrics and vision-language model analysis facilitates iterative policy improvement at both tactical and strategic levels. The resulting policies are instantaneously deployable, human-interpretable, and capable of generalizing across diverse gaming environments. Experimental results demonstrate PORTAL's effectiveness across thousands of first-person shooter (FPS) games, showcasing significant improvements in development efficiency, policy generalization, and behavior diversity compared to traditional approaches. PORTAL represents a significant advancement in game AI development, offering a practical solution for creating sophisticated agents that can operate across thousands of commercial video games with minimal development overhead. Experimental results on the 3D video games are best viewed at https://zhongwen.one/projects/portal.
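To make the hybrid policy structure concrete, the following is a minimal sketch of a behavior tree that mixes rule-based nodes with a neural component. It is not PORTAL's DSL or implementation: the node classes, the state keys (`health`, `enemy_visible`, `enemy_angle`), and the `aim_model` stand-in for a neural network are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]  # hypothetical game-state representation


class Node:
    def tick(self, state: State, out: List[str]) -> bool:
        raise NotImplementedError


@dataclass
class Condition(Node):
    """Rule-based node: succeeds when a predicate on the state holds."""
    predicate: Callable[[State], bool]

    def tick(self, state: State, out: List[str]) -> bool:
        return self.predicate(state)


@dataclass
class Action(Node):
    """Leaf that emits a named high-level action and always succeeds."""
    name: str

    def tick(self, state: State, out: List[str]) -> bool:
        out.append(self.name)
        return True


@dataclass
class NeuralAction(Node):
    """Leaf whose parameter comes from a learned model (stubbed here)."""
    name: str
    model: Callable[[State], float]

    def tick(self, state: State, out: List[str]) -> bool:
        out.append(f"{self.name}({self.model(state):.2f})")
        return True


@dataclass
class Sequence(Node):
    """Succeeds only if every child succeeds; commits actions on success."""
    children: List[Node]

    def tick(self, state: State, out: List[str]) -> bool:
        buffered: List[str] = []
        for child in self.children:
            if not child.tick(state, buffered):
                return False
        out.extend(buffered)
        return True


@dataclass
class Selector(Node):
    """Tries children in priority order and stops at the first success."""
    children: List[Node]

    def tick(self, state: State, out: List[str]) -> bool:
        return any(child.tick(state, out) for child in self.children)


def aim_model(state: State) -> float:
    # Assumption: a stand-in for a neural network that outputs a yaw correction.
    return 0.5 * state["enemy_angle"]


tree = Selector([
    Sequence([Condition(lambda s: s["health"] < 30), Action("retreat")]),
    Sequence([
        Condition(lambda s: s["enemy_visible"] > 0),
        NeuralAction("aim", aim_model),
        Action("fire"),
    ]),
    Action("patrol"),
])


def decide(state: State) -> List[str]:
    """Run one tick of the tree and return the emitted actions."""
    out: List[str] = []
    tree.tick(state, out)
    return out
```

In this sketch the rule nodes carry the high-level strategy (retreat when hurt, engage when an enemy is visible, otherwise patrol), while the neural leaf supplies the fine-grained control value. Because the tree is plain structured data, a reader can inspect why a given action was chosen.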
Problem

Research questions and friction points this paper is trying to address.

Develop AI agents for thousands of 3D games.
Transform decision-making into language modeling tasks.
Enable rapid adaptability and strategic depth in gaming.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Language-guided policy generation for AI agents
Hybrid policy structure combining rule-based and neural components
Dual-feedback mechanism for iterative policy improvement
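The dual-feedback idea can be sketched as a simple refinement loop. This is an illustrative outline under stated assumptions, not the authors' procedure: `propose` stands in for an LLM call that writes a DSL policy, `run_metrics` for a game rollout that returns quantitative metrics, and `analyze_scene` for a VLM critique of gameplay. All four callables, the feedback record layout, and the toy stubs are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Policy = str                 # assumed: the policy is DSL text
Feedback = Dict[str, str]    # assumed: one record per refinement round


def refine_policy(
    propose: Callable[[List[Feedback]], Policy],
    run_metrics: Callable[[Policy], Dict[str, float]],
    analyze_scene: Callable[[Policy], str],
    score: Callable[[Dict[str, float]], float],
    rounds: int = 3,
) -> Tuple[Optional[Policy], float]:
    """Iteratively propose policies, feeding both feedback channels back in."""
    history: List[Feedback] = []
    best_policy: Optional[Policy] = None
    best_score = float("-inf")
    for _ in range(rounds):
        policy = propose(history)            # language-guided generation
        metrics = run_metrics(policy)        # quantitative feedback
        critique = analyze_scene(policy)     # visual/scene feedback
        current = score(metrics)
        if current > best_score:
            best_policy, best_score = policy, current
        history.append({
            "policy": policy,
            "metrics": repr(metrics),
            "critique": critique,
        })
    return best_policy, best_score


# Toy stubs so the loop can be run end to end; they do not model a real game.
def propose(history: List[Feedback]) -> Policy:
    return f"policy_v{len(history)}"


def run_metrics(policy: Policy) -> Dict[str, float]:
    version = int(policy.rsplit("v", 1)[1])
    return {"kills": float(version), "deaths": 1.0}


def analyze_scene(policy: Policy) -> str:
    return "agent gets stuck near walls"


def score(metrics: Dict[str, float]) -> float:
    return metrics["kills"] - metrics["deaths"]


best, best_score = refine_policy(propose, run_metrics, analyze_scene, score, rounds=3)
```

Keeping the two feedback channels as separate fields in the history lets the generator see both what the numbers say and what the scene analysis observed when it writes the next policy.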
Zhongwen Xu
Tencent
Artificial Intelligence, Reinforcement Learning, Large Language Models
Xianliang Wang
Tencent
Siyi Li
Tencent
Tao Yu
Tencent
Liang Wang
Tencent
Qiang Fu
Tencent
Wei Yang
Tencent