🤖 AI Summary
This paper addresses the low training efficiency and high hardware cost of deep reinforcement learning (DRL) on consumer-grade machines. The authors propose BTR ("Beyond The Rainbow"), a lightweight, computationally efficient framework that handles both 2D and 3D games. Building on Rainbow DQN (which already combines prioritized experience replay, double Q-learning, noisy networks, distributional Q-learning, multi-step returns, and dueling networks), BTR integrates six further improvements from across the RL literature while keeping memory and computational overhead low. On a single high-end desktop PC, BTR completes 200 million frames of Atari training in roughly 12 hours and achieves a human-normalized interquartile mean (IQM) score of 7.4 on Atari-60, a new state-of-the-art for desktop-scale RL. With minimal algorithmic changes, BTR also transfers to complex 3D environments, training agents end-to-end to play Super Mario Galaxy, Mario Kart, and Mortal Kombat. The paper includes detailed ablation studies of each component and releases open-source code.
📝 Abstract
Rainbow Deep Q-Network (DQN) demonstrated that combining multiple independent enhancements can significantly boost a reinforcement learning (RL) agent's performance. In this paper, we present "Beyond The Rainbow" (BTR), a novel algorithm that integrates six improvements from across the RL literature into Rainbow DQN, establishing a new state-of-the-art for RL using a desktop PC, with a human-normalized interquartile mean (IQM) of 7.4 on Atari-60. Beyond Atari, we demonstrate BTR's capability to handle complex 3D games, successfully training agents to play Super Mario Galaxy, Mario Kart, and Mortal Kombat with minimal algorithmic changes. Because BTR is designed with computational efficiency in mind, agents can be trained on a high-end desktop PC for 200 million Atari frames within 12 hours. Additionally, we conduct detailed ablation studies of each component, analyzing the performance and impact using numerous measures. Code is available at https://github.com/VIPTankz/BTR.
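The headline result is reported as a human-normalized interquartile mean (IQM), a robust aggregate metric for RL benchmarks. As a minimal sketch of how that metric is typically computed (the function names and example scores here are illustrative, not taken from the paper's codebase): per-game scores are first normalized against random-play and human baselines, then the IQM averages the middle 50% of all normalized values.

```python
import numpy as np

def human_normalized(score, random_score, human_score):
    # Standard human-normalized score: 0 = random play, 1 = human-level.
    return (score - random_score) / (human_score - random_score)

def iqm(scores):
    # Interquartile mean: the mean of the middle 50% of values, a robust
    # alternative to the plain mean (outlier-sensitive) or median.
    s = np.sort(np.asarray(scores, dtype=float).ravel())
    trim = len(s) // 4  # drop the lowest and highest 25%
    return s[trim:len(s) - trim].mean()

# Illustrative only: normalized scores across a handful of games/seeds.
normalized = [0.5, 1.2, 7.0, 9.5, 3.1, 150.0, 4.4, 2.2]
print(iqm(normalized))  # outlier 150.0 is excluded from the average
```

An IQM of 7.4 therefore means that, over the middle half of game/seed scores, the agent averages 7.4 times the human reference performance above random play.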