Mind the GAP! The Challenges of Scale in Pixel-based Deep Reinforcement Learning

📅 2025-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Pixel-level deep reinforcement learning (DRL) suffers severe performance degradation in large-scale environments, yet its root cause has remained unclear. This paper identifies and empirically validates, for the first time, a critical bottleneck: feature dimension mismatch between convolutional encoders and subsequent fully connected layers. To address this, we propose replacing the conventional flattening operation with global average pooling (GAP), yielding a lightweight, parameter-free bottleneck mitigation strategy. Evaluated on multi-scale benchmarks—including Atari and DeepMind Lab—our approach substantially improves training stability and final performance. It demonstrates strong generalization across diverse DRL algorithms (e.g., Q-learning, PPO) and backbone architectures. Beyond diagnosing a fundamental scalability limitation in DRL, this work establishes a simple, broadly applicable architectural optimization paradigm that enhances representational consistency without increasing model complexity.

📝 Abstract
Scaling deep reinforcement learning in pixel-based environments presents a significant challenge, often resulting in diminished performance. While recent works have proposed algorithmic and architectural approaches to address this, the underlying cause of the performance drop remains unclear. In this paper, we identify the connection between the output of the encoder (a stack of convolutional layers) and the ensuing dense layers as the main underlying factor limiting scaling capabilities; we denote this connection as the bottleneck, and we demonstrate that previous approaches implicitly target this bottleneck. As a result of our analyses, we present global average pooling as a simple yet effective way of targeting the bottleneck, thereby avoiding the complexity of earlier approaches.
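The flatten-vs-GAP swap described above can be sketched in a few lines. This is a minimal PyTorch illustration assuming a standard Atari-style three-layer convolutional encoder; the layer sizes and input resolution are illustrative defaults, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Atari-style conv stack (illustrative sizes, not the paper's exact config)."""
    def __init__(self, in_channels=4, out_channels=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, out_channels, kernel_size=3, stride=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.conv(x)

enc = Encoder()
x = torch.zeros(1, 4, 84, 84)   # one stacked-frame observation
feat = enc(x)                   # shape (1, 64, 7, 7)

# Conventional bottleneck: flatten the feature map, so the first dense
# layer sees 64 * 7 * 7 = 3136 inputs, and this grows with image size.
flat = feat.flatten(1)          # shape (1, 3136)

# GAP bottleneck: average each channel's spatial map, so the dense layer
# sees only 64 inputs regardless of resolution, with zero extra parameters.
gap = feat.mean(dim=(2, 3))     # shape (1, 64)
```

Because GAP's output width depends only on the channel count, the dense head stays the same size as the observation resolution or encoder depth scales, which is the "representational consistency" the summary refers to.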
Problem

Research questions and friction points this paper is trying to address.

Identifying the bottleneck that limits scaling in pixel-based deep reinforcement learning
Examining the encoder-to-dense-layer connection as the factor limiting performance
Proposing global average pooling as a simpler way to mitigate the bottleneck
Innovation

Methods, ideas, or system contributions that make the work stand out.

Identifies the encoder-dense-layer connection as the bottleneck
Proposes global average pooling as a parameter-free fix
Simplifies scaling in pixel-based RL