🤖 AI Summary
Reinforcement learning (RL) faces challenges in commercial game NPC deployment, including poor interpretability, difficulty supporting long-horizon strategic planning, and limited multi-task coordination. Method: This paper proposes a tightly integrated RL-Behavior Tree (BT) framework. Leveraging the AMD Schola plugin, we implement a closed-loop training system in Unreal Engine, combining Proximal Policy Optimization (PPO), curriculum learning, and a hierarchical reward structure to enable node-level BT policy optimization and global behavioral coordination. Contribution/Results: To our knowledge, this is the first work to demonstrate end-to-end RL-BT co-training in a complex 3D open-world environment inspired by *The Last of Us*. Experiments show that trained NPCs autonomously execute exploration, close- and long-range combat, and squad-level cooperation; dynamically switch skills; and plan over extended time horizons. The approach significantly improves behavioral adaptability, interpretability, and perceptual realism, establishing a novel paradigm for industrial-scale game AI deployment.
📝 Abstract
While the rapid advancements in the reinforcement learning (RL) research community have been remarkable, adoption in commercial video games remains slow. In this paper, we outline common challenges the Game AI community faces when using RL-driven NPCs in practice, and highlight the intersection of RL with traditional behavior trees (BTs) as a crucial juncture to be explored further. Although the BT+RL intersection has been suggested in several research papers, its adoption is rare. We demonstrate the viability of this approach using AMD Schola -- a plugin for training RL agents in Unreal Engine -- by creating multi-task NPCs in a complex 3D environment inspired by the commercial video game *The Last of Us*. We provide detailed methodologies for jointly training RL models with BTs while showcasing various skills.
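The core integration pattern described above -- an RL policy embedded as a leaf node inside a behavior tree, so the tree retains interpretable high-level control while the policy handles low-level action selection -- can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the `RLPolicyNode` and `Selector` classes, the `Status` values, and the `policy`/`env_step` callables are all hypothetical stand-ins for the Unreal Engine BT nodes and Schola-trained PPO policies used in the paper.

```python
from enum import Enum


class Status(Enum):
    """Standard behavior-tree tick results."""
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3


class RLPolicyNode:
    """Hypothetical BT leaf that delegates its tick to a trained RL policy.

    `policy` maps an observation to an action; `env_step` applies the
    action to the game world and returns True when the sub-task is done.
    """

    def __init__(self, policy, max_steps=50):
        self.policy = policy
        self.max_steps = max_steps  # budget before control returns to the BT
        self.steps = 0

    def tick(self, observation, env_step):
        # Query the learned policy and act in the environment.
        action = self.policy(observation)
        done = env_step(action)
        self.steps += 1
        if done:
            self.steps = 0
            return Status.SUCCESS
        if self.steps >= self.max_steps:
            # Time out: fail so the tree can re-plan with another branch.
            self.steps = 0
            return Status.FAILURE
        return Status.RUNNING


class Selector:
    """Standard BT selector: tries children until one does not fail."""

    def __init__(self, children):
        self.children = children

    def tick(self, observation, env_step):
        for child in self.children:
            status = child.tick(observation, env_step)
            if status != Status.FAILURE:
                return status
        return Status.FAILURE
```

In this arrangement the tree's structure (selectors, sequences, decorators) stays hand-authored and inspectable, while each `RLPolicyNode` wraps a skill (e.g. close-range combat, exploration) trained separately with PPO; switching skills is then just ordinary BT branch selection.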