Continual Reinforcement Learning via Autoencoder-Driven Task and New Environment Recognition

📅 2025-05-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Continual reinforcement learning faces challenges in autonomously detecting environmental shifts, retaining knowledge, and retrieving relevant memories—without external task labels. This paper introduces the first end-to-end differentiable policy optimization framework integrating a familiarity-aware autoencoder: the autoencoder learns compact environment representations, and reconstruction error serves as an unsupervised signal for task boundary detection and selective memory retrieval. Crucially, the method operates without explicit task boundary annotations, enabling multi-task sequential learning and robust re-identification of previously encountered environments. Evaluated on standard continual RL benchmarks, it significantly mitigates catastrophic forgetting. Key contributions are: (1) the first familiarity-aware, end-to-end differentiable continual policy optimization; (2) a unified architecture jointly modeling environment identification, change detection, and memory retrieval; and (3) empirical validation of effective continual adaptation under fully unsupervised task delineation—i.e., with no task identity signals whatsoever.

📝 Abstract
Continual learning for reinforcement learning agents remains a significant challenge, particularly in preserving and leveraging existing information without an external signal to indicate changes in tasks or environments. In this study, we explore the effectiveness of autoencoders in detecting new tasks and matching observed environments to previously encountered ones. Our approach integrates policy optimization with familiarity autoencoders within an end-to-end continual learning system. This system can recognize and learn new tasks or environments while preserving knowledge from earlier experiences and can selectively retrieve relevant knowledge when re-encountering a known environment. Initial results demonstrate successful continual learning without external signals to indicate task changes or re-encounters, showing promise for this methodology.
Problem

Research questions and friction points this paper is trying to address.

Detect new tasks in continual reinforcement learning
Match environments to previously encountered ones
Preserve and leverage knowledge without external signals
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autoencoder-driven task and environment recognition
End-to-end continual learning system integration
Selective knowledge retrieval for known environments
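The familiarity mechanism listed above can be sketched in a few lines: maintain one autoencoder per recognized environment, and treat a batch whose best reconstruction error exceeds a threshold as a new task. This is a minimal illustration, not the paper's implementation; the `FamiliarityAutoencoder` and `TaskMemory` names, the closed-form linear autoencoder (fitted via SVD rather than trained by gradient descent), and the error threshold are all assumptions made here for brevity.

```python
import numpy as np

class FamiliarityAutoencoder:
    """Linear autoencoder fitted in closed form via SVD.

    Illustrative stand-in for a learned autoencoder: the top `latent_dim`
    principal directions give the optimal linear encoder/decoder pair.
    """

    def __init__(self, latent_dim):
        self.latent_dim = latent_dim
        self.mean = None
        self.components = None  # shape (latent_dim, obs_dim)

    def fit(self, observations):
        self.mean = observations.mean(axis=0)
        _, _, vt = np.linalg.svd(observations - self.mean, full_matrices=False)
        self.components = vt[: self.latent_dim]

    def reconstruction_error(self, observations):
        codes = (observations - self.mean) @ self.components.T   # encode
        recon = codes @ self.components + self.mean              # decode
        return float(np.mean((recon - observations) ** 2))

class TaskMemory:
    """Unsupervised environment identification via familiarity thresholding."""

    def __init__(self, latent_dim, threshold):
        self.latent_dim = latent_dim
        self.threshold = threshold  # hypothetical familiarity cutoff
        self.autoencoders = []      # one autoencoder per recognized environment

    def identify(self, batch):
        """Return (environment index, is_new) for a batch of observations."""
        errors = [ae.reconstruction_error(batch) for ae in self.autoencoders]
        if errors and min(errors) < self.threshold:
            # Familiar: retrieve the slot of the best-matching environment.
            return int(np.argmin(errors)), False
        # Unfamiliar: register a new environment and fit its autoencoder.
        ae = FamiliarityAutoencoder(self.latent_dim)
        ae.fit(batch)
        self.autoencoders.append(ae)
        return len(self.autoencoders) - 1, True
```

In a full agent, the returned index would select which stored policy or memory to continue training with, so task boundaries and re-encounters are detected purely from observation statistics, with no task labels.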
Zeki Doruk Erden
École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Donia Gasmi
École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Boi Faltings
École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland