🤖 AI Summary
To address the limited representational capacity of encoders in pixel-based model-free reinforcement learning, and the correspondingly constrained performance gains on the Atari-57 benchmark, this paper proposes the Hadamax (Hadamard max-pooling) encoder, a novel encoder architecture. Hadamax max-pools Hadamard products between GELU-activated parallel hidden layers, enabling efficient feature extraction within the PQN algorithmic framework. Evaluated on Atari-57, Hadamax-PQN achieves state-of-the-art model-free performance: its mean human-normalized score improves by 80% over vanilla PQN and significantly surpasses Rainbow-DQN, without requiring any hyperparameter tuning. The core contribution is an architectural innovation for nonlinear modeling, namely the synergistic fusion of multiplicative interactions, gated nonlinearities, and hierarchical spatial pooling, which establishes a scalable, high-performance encoding paradigm for pixel-based model-free RL.
📝 Abstract
Neural network architectures have a large impact on machine learning. In reinforcement learning, however, network architectures have remained notably simple, as changes often lead to only small gains in performance. This work introduces a novel encoder architecture for pixel-based model-free reinforcement learning. The Hadamax (**Hada**mard **max**-pooling) encoder achieves state-of-the-art performance by max-pooling Hadamard products between GELU-activated parallel hidden layers. Built on the recent PQN algorithm, the Hadamax encoder achieves state-of-the-art model-free performance on the Atari-57 benchmark. Specifically, without applying any algorithmic hyperparameter modifications, Hadamax-PQN achieves an 80% performance gain over vanilla PQN and significantly surpasses Rainbow-DQN. For reproducibility, the full code is available on [GitHub](https://github.com/Jacobkooi/Hadamax).
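The core operation described above (max-pooling the Hadamard product of two GELU-activated parallel hidden layers) can be sketched conceptually. The snippet below is a minimal numpy illustration, not the paper's actual implementation: the function name `hadamax_block`, the pointwise projections, and the 2x2 pool size are all assumptions chosen for clarity.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def hadamax_block(x, w1, w2):
    """Conceptual Hadamax-style block (illustrative shapes, not the paper's exact layers).

    x:      (H, W, C_in) spatial feature map
    w1, w2: (C_in, C_out) weights of two parallel pointwise hidden layers
    """
    h1 = gelu(x @ w1)              # parallel hidden layer 1, GELU-activated
    h2 = gelu(x @ w2)              # parallel hidden layer 2, GELU-activated
    h = h1 * h2                    # Hadamard (elementwise) product
    # 2x2 spatial max-pooling over the product
    H, W, C = h.shape
    h = h[:H - H % 2, :W - W % 2, :]
    return h.reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))
w1 = rng.normal(size=(4, 16)) * 0.1
w2 = rng.normal(size=(4, 16)) * 0.1
out = hadamax_block(x, w1, w2)
print(out.shape)  # → (4, 4, 16)
```

The multiplicative interaction lets each output feature gate the other branch, while the max-pool keeps only the strongest local responses of that product; the real encoder applies this pattern with convolutional layers inside the PQN pipeline.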