🤖 AI Summary
Recurrent off-policy deep reinforcement learning (DRL) suffers from high computational overhead, limiting its practical deployment. Method: This paper proposes RISE, a lightweight framework that integrates learnable and fixed encoders synergistically to enable efficient temporal modeling without significant computational overhead; it introduces a simplified recurrent encoding mechanism that decouples temporal modeling from computationally expensive operations. Contribution/Results: RISE achieves plug-and-play integration of recurrent structures into mainstream off-policy algorithms—including DQN and SAC—for the first time. Evaluated on the Atari benchmark, RISE improves human-normalized interquartile mean (IQM) by 35.6% while increasing inference latency by less than 5%, demonstrating the feasibility of low-overhead, high-performance temporal modeling in off-policy DRL.
📝 Abstract
Recurrent off-policy deep reinforcement learning models achieve state-of-the-art performance but are often sidelined due to their high computational demands. In response, we introduce RISE (Recurrent Integration via Simplified Encodings), a novel approach that can leverage recurrent networks in any image-based off-policy RL setting without significant computational overhead by combining learnable and non-learnable encoder layers. Integrated into leading non-recurrent off-policy RL algorithms, RISE yields a 35.6% human-normalized interquartile mean (IQM) performance improvement across the Atari benchmark. We analyze various implementation strategies to highlight the versatility and potential of our proposed framework.
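The core idea of pairing a learnable encoder with a fixed (non-learnable) recurrent layer can be sketched minimally. The paper's exact architecture is not given here; the sketch below assumes a frozen random recurrent layer (echo-state-network style) for temporal mixing on top of a learnable per-frame projection, with all names (`FixedRecurrentEncoder`, `W_learn`) being illustrative, not from RISE itself.

```python
import numpy as np

rng = np.random.default_rng(0)

class FixedRecurrentEncoder:
    """Frozen random recurrent layer: weights are sampled once and never
    trained, so temporal modeling adds almost no optimization cost."""
    def __init__(self, in_dim, hidden_dim, spectral_scale=0.9):
        self.W_in = rng.normal(0, 1.0 / np.sqrt(in_dim), (in_dim, hidden_dim))
        W = rng.normal(0, 1.0, (hidden_dim, hidden_dim))
        # Rescale so the recurrence is stable (spectral radius < 1).
        W *= spectral_scale / max(abs(np.linalg.eigvals(W)))
        self.W_h = W

    def __call__(self, feats):
        # feats: (T, in_dim) sequence of per-frame features.
        h = np.zeros(self.W_h.shape[0])
        for x in feats:
            h = np.tanh(x @ self.W_in + h @ self.W_h)
        return h  # fixed-size summary of the whole observation sequence

# Learnable part: a linear projection standing in for the image encoder,
# which in an actual agent would be trained by the off-policy RL loss.
in_dim, feat_dim, hidden_dim = 16, 8, 32
W_learn = rng.normal(0, 0.1, (in_dim, feat_dim))
fixed_rnn = FixedRecurrentEncoder(feat_dim, hidden_dim)

frames = rng.normal(size=(4, in_dim))     # T=4 stacked observations
per_frame = np.tanh(frames @ W_learn)     # learnable encoding
temporal_code = fixed_rnn(per_frame)      # frozen temporal mixing
print(temporal_code.shape)
```

Because the recurrent weights are frozen, no gradients flow through the temporal loop, which is one plausible way a recurrent structure could be added to algorithms like DQN or SAC with only a small inference-latency cost.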