Forager: a lightweight testbed for continual learning with partial observability in RL

📅 2026-05-01
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses two gaps in continual reinforcement learning (CRL): experiments on loss of plasticity rarely account for partial observability, and many partially observable CRL environments are prohibitively expensive to run. To this end, the authors propose Forager, a lightweight, partially observable CRL environment with a constant memory footprint. Experiments and sample tasks show that current CRL agents exhibit loss of plasticity in Forager, that proposed mitigations can help, and that leveraging state construction is the most useful remedy. A variant of Forager that generates an unending stream of new tasks further highlights the limitations of current CRL agents.
📝 Abstract
In continual reinforcement learning (CRL), good performance requires never-ending learning, acting, and exploration in a big, partially observable world. Most CRL experiments have focused on loss of plasticity -- the inability to keep learning -- in one-off experiments where some unobservable non-stationarity is added to classic fully observable MDPs. Further, these experiments rarely consider the role of partial observability and the importance of CRL agents that use memory or recurrence. One potential reason for this focus on mitigating loss of plasticity without considering partial observability is that many partially-observable CRL environments are prohibitively expensive. In this paper, we introduce Forager, a light-weight partially-observable CRL environment with a constant memory footprint. We provide a set of experiments and sample tasks demonstrating that Forager is challenging for current CRL agents and yet also allows for in-depth study of those agents. We demonstrate that agents exhibit loss of plasticity, proposed mitigations can help, but that most useful is to leverage state construction. We conclude with a variant of Forager that generates an unending stream of new tasks to learn that clearly highlights the limitations of current CRL agents.
Problem

Research questions and friction points this paper is trying to address.

continual reinforcement learning
partial observability
loss of plasticity
lightweight testbed
memory-based agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

continual reinforcement learning
partial observability
lightweight testbed
loss of plasticity
state construction