🤖 AI Summary
This paper addresses the challenge of continual learning (CL) under dynamic action spaces, where an agent's actionable capabilities evolve across tasks and degrade policy generalization, and introduces the problem of Continual Learning with Dynamic Capabilities (CL-DC). Methodologically, the proposed Action-Adaptive Continual Learning (AACL) framework constructs an action representation space and employs an adaptive encoder-decoder architecture to decouple policy learning from task-specific action sets; for each new action space, the encoder-decoder is adaptively fine-tuned to balance stability and plasticity; the design is inspired by cortical functional principles. Contributions include: (i) the first formal definition of the CL-DC problem; (ii) the action-adaptive representation-transfer mechanism of AACL; and (iii) a benchmark based on three heterogeneous environments for evaluating CL-DC methods. Across these environments, the approach achieves an average performance gain of 23.6% over state-of-the-art CL methods.
📝 Abstract
Continual Learning (CL) is a powerful tool that enables agents to learn a sequence of tasks, accumulating knowledge from the past and using it for problem-solving or future task learning. However, existing CL methods often assume that the agent's capabilities remain static within dynamic environments, which does not reflect real-world scenarios where capabilities change over time. This paper introduces a new and realistic problem, Continual Learning with Dynamic Capabilities (CL-DC), which poses a significant challenge for CL agents: how can policy generalization across different action spaces be achieved? Inspired by cortical functions, we propose an Action-Adaptive Continual Learning framework (AACL) to address this challenge. Our framework decouples the agent's policy from the specific action space by building an action representation space. For a new action space, the action-representation encoder-decoder is adaptively fine-tuned to maintain a balance between stability and plasticity. Furthermore, we release a benchmark based on three environments to validate the effectiveness of methods for CL-DC. Experimental results demonstrate that our framework outperforms popular methods by generalizing the policy across action spaces.
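To make the decoupling idea concrete, here is a minimal sketch of how a policy can act through a fixed action-representation space while a per-task decoder adapts to each new action set. Note this is an illustrative toy, not the paper's actual AACL implementation: the class name, the linear parameterization, and the random (un-trained) decoder initialization are all assumptions for exposition.

```python
import numpy as np

class ActionAdaptivePolicy:
    """Toy sketch of the decoupling idea: the shared policy maps states into a
    fixed-dimensional action-representation space; each task contributes only a
    decoder from that space onto its own action set. Sizes and the linear
    parameterization are illustrative assumptions, not the paper's design."""

    def __init__(self, state_dim, repr_dim, seed=0):
        self.rng = np.random.default_rng(seed)
        self.repr_dim = repr_dim
        # Shared policy: state -> action representation (kept stable across tasks).
        self.policy_w = self.rng.normal(size=(state_dim, repr_dim)) * 0.1
        self.decoders = {}  # task_id -> (repr_dim x n_actions) decoder matrix

    def add_action_space(self, task_id, n_actions):
        # A new action space only introduces (or fine-tunes) a decoder;
        # the shared policy is untouched, so stability and plasticity
        # live in different components.
        self.decoders[task_id] = self.rng.normal(size=(self.repr_dim, n_actions)) * 0.1

    def act(self, task_id, state):
        z = state @ self.policy_w            # point in the action-representation space
        logits = z @ self.decoders[task_id]  # decode onto the current action set
        return int(np.argmax(logits))

# Usage: the same policy weights serve two tasks with different action counts.
policy = ActionAdaptivePolicy(state_dim=4, repr_dim=8)
policy.add_action_space("task_a", n_actions=3)
policy.add_action_space("task_b", n_actions=5)
state = np.ones(4)
a_a = policy.act("task_a", state)  # index in {0, 1, 2}
a_b = policy.act("task_b", state)  # index in {0, ..., 4}
```

In the paper's full framework the encoder-decoder is fine-tuned rather than randomly initialized, and the representation space itself is learned; the sketch only shows where the policy/action-space boundary sits.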