🤖 AI Summary
Task forgetting in multi-task reinforcement learning (RL) undermines continual learning, yet its underlying dynamics and implications for curriculum design remain poorly understood. Method: We empirically characterize task forgetting and discover that its temporal decay closely follows an exponential curve—mirroring human cognitive forgetting—and identify asymmetric inter-task learning/retention patterns as the fundamental cause of failure for existing performance- or memory-based curriculum strategies. We systematically adapt and evaluate cognitive-inspired spaced-repetition methods (e.g., Leitner, SuperMemo), revealing their poor generalization to RL settings. Building on these insights, we propose a novel task-scheduling paradigm explicitly designed for continual RL. Contribution/Results: Our work establishes the first cross-domain cognitive foundation for RL continual learning, provides a reproducible quantitative model of task forgetting, and introduces an empirical benchmark for evaluating forgetting-aware schedulers—shifting task scheduling from heuristic-driven to mechanism-driven design.
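The summary's central quantitative claim is that retention decays exponentially over time, mirroring classic human forgetting curves. A minimal sketch of how such a fit can be done, assuming hypothetical normalized retention measurements (an agent's return on a held-out task, recorded at increasing intervals after training switches away from it — the numbers below are illustrative, not from the paper):

```python
import numpy as np

# Hypothetical retention data: normalized performance on a task,
# measured t units of training after the agent last trained on it.
t = np.array([0, 1, 2, 4, 8, 16], dtype=float)
r = np.array([1.0, 0.78, 0.62, 0.40, 0.17, 0.03])

# Exponential forgetting model R(t) = exp(-t / s), with s a "memory
# stability" constant. Taking logs turns the fit into linear regression:
# log R(t) = -(1/s) * t.
mask = r > 0                              # log undefined at zero retention
slope = np.polyfit(t[mask], np.log(r[mask]), 1)[0]
s = -1.0 / slope                          # fitted stability constant

predicted = np.exp(-t / s)                # model retention at measured times
print(s)
print(predicted)
```

A larger fitted `s` means slower forgetting; comparing `s` across tasks is one way to expose the asymmetric retention patterns the summary describes.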
📝 Abstract
Reinforcement learning (RL) agents can forget tasks they have previously been trained on. There is a rich body of work on such forgetting effects in humans, so we look for commonalities between human and RL-agent forgetting across tasks and test whether forgetting-prevention measures from learning theory remain viable in RL. We find that in many cases, RL agents exhibit forgetting curves similar to those of humans. Methods such as Leitner and SuperMemo have been shown to be effective at counteracting human forgetting, but we demonstrate that they do not transfer well to RL. We identify a likely cause: asymmetric learning and retention patterns between tasks that cannot be captured by retention-based or performance-based curriculum strategies.
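For readers unfamiliar with the spaced-repetition methods the abstract evaluates, a minimal sketch of a Leitner-style task scheduler adapted to RL is shown below. The task names, box count, and sampling weights are illustrative assumptions, not details from the paper:

```python
import random

class LeitnerScheduler:
    """Sketch of a Leitner-box scheduler adapted from flashcard spaced
    repetition: tasks the agent handles well move to higher boxes and are
    revisited less often; tasks it fails on drop back to box 0."""

    def __init__(self, tasks, num_boxes=3, seed=0):
        self.boxes = {task: 0 for task in tasks}  # all tasks start in box 0
        self.num_boxes = num_boxes
        self.rng = random.Random(seed)

    def next_task(self):
        # Lower boxes are sampled more often: box b gets weight 2^(top - b).
        tasks = list(self.boxes)
        weights = [2 ** (self.num_boxes - 1 - self.boxes[t]) for t in tasks]
        return self.rng.choices(tasks, weights=weights, k=1)[0]

    def update(self, task, success):
        # Classic Leitner rule: promote on success, demote to box 0 on failure.
        if success:
            self.boxes[task] = min(self.boxes[task] + 1, self.num_boxes - 1)
        else:
            self.boxes[task] = 0

# Hypothetical usage with made-up task names.
sched = LeitnerScheduler(["reach", "push", "pick-place"])
sched.update("reach", success=True)   # "reach" moves up, sampled less often
task = sched.next_task()
```

Note the failure mode the abstract points to: the scheduler's state is a per-task performance signal only, so it cannot represent asymmetric cross-task effects (training task A eroding task B faster than the reverse).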