Policy Compatible Skill Incremental Learning via Lazy Learning Interface

📅 2025-09-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In skill incremental learning (SIL), skills that evolve over time can lose compatibility with the policies built on them, which impedes skill reuse and limits generalization. To address this, we propose SIL-C, a framework centered on a bilateral lazy learning-based mapping that dynamically aligns the subtask space referenced by policies with the skill space decoded into agent behaviors, so existing policies can be reused without retraining or structural adaptation. A policy decomposes a complex task into subtasks, and each subtask is executed by the skill selected for it based on trajectory distribution similarity. Evaluation across diverse SIL scenarios shows that SIL-C maintains compatibility between evolving skills and downstream policies while remaining efficient throughout the learning process.

📝 Abstract

Skill Incremental Learning (SIL) is the process by which an embodied agent expands and refines its skill set over time by leveraging experience gained through interaction with its environment or by the integration of additional data. SIL facilitates efficient acquisition of hierarchical policies grounded in reusable skills for downstream tasks. However, as the skill repertoire evolves, it can disrupt compatibility with existing skill-based policies, limiting their reusability and generalization. In this work, we propose SIL-C, a novel framework that ensures skill-policy compatibility, allowing improvements in incrementally learned skills to enhance the performance of downstream policies without requiring policy re-training or structural adaptation. SIL-C employs a bilateral lazy learning-based mapping technique to dynamically align the subtask space referenced by policies with the skill space decoded into agent behaviors. This enables each subtask, derived from the policy's decomposition of a complex task, to be executed by selecting an appropriate skill based on trajectory distribution similarity. We evaluate SIL-C across diverse SIL scenarios and demonstrate that it maintains compatibility between evolving skills and downstream policies while ensuring efficiency throughout the learning process.
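The mapping described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `LazyInterface`, `GaussianSummary`, `summarize`, and `w2_squared` are hypothetical, and summarizing trajectories as diagonal Gaussians compared by squared 2-Wasserstein distance is an assumed stand-in for "trajectory distribution similarity", which the abstract does not define. The sketch only shows the lazy-learning idea: skills are stored as-is, and the subtask-to-skill match is resolved at query time, so adding a skill changes what gets executed without touching the policy.

```python
from dataclasses import dataclass, field
from statistics import fmean, pstdev
from typing import Dict, Sequence, Tuple

# A trajectory is a sequence of state vectors (assumed representation).
Trajectory = Sequence[Sequence[float]]


@dataclass(frozen=True)
class GaussianSummary:
    """Per-dimension mean and standard deviation of visited states."""
    mean: Tuple[float, ...]
    std: Tuple[float, ...]


def summarize(trajectories: Sequence[Trajectory]) -> GaussianSummary:
    """Pool all states from the trajectories and fit a diagonal Gaussian."""
    states = [state for traj in trajectories for state in traj]
    dims = list(zip(*states))  # one tuple of values per state dimension
    return GaussianSummary(
        mean=tuple(fmean(d) for d in dims),
        std=tuple(pstdev(d) for d in dims),
    )


def w2_squared(a: GaussianSummary, b: GaussianSummary) -> float:
    """Squared 2-Wasserstein distance between two diagonal Gaussians."""
    mean_term = sum((x - y) ** 2 for x, y in zip(a.mean, b.mean))
    std_term = sum((x - y) ** 2 for x, y in zip(a.std, b.std))
    return mean_term + std_term


@dataclass
class LazyInterface:
    """Stores skill summaries; matching is deferred until a subtask is queried."""
    skills: Dict[str, GaussianSummary] = field(default_factory=dict)

    def add_skill(self, name: str, demos: Sequence[Trajectory]) -> None:
        # No training step: the skill is simply stored (lazy learning).
        self.skills[name] = summarize(demos)

    def select(self, subtask: GaussianSummary) -> str:
        if not self.skills:
            raise ValueError("no skills registered")
        # Pick the skill whose trajectory distribution is closest to the subtask.
        return min(self.skills, key=lambda n: w2_squared(subtask, self.skills[n]))


# Hypothetical data: the subtask the (unchanged) policy refers to.
subtask = summarize([[(0.9, 0.9), (1.0, 1.0), (1.1, 1.0)]])

interface = LazyInterface()
interface.add_skill("reach", [[(0.0, 0.0), (0.1, 0.0), (0.2, 0.1)]])
before = interface.select(subtask)  # only "reach" exists at this point

# A new skill arrives later; the policy and its subtask are not modified.
interface.add_skill("push", [[(1.0, 0.9), (1.0, 1.1), (1.1, 1.0)]])
after = interface.select(subtask)

print(before, after)
```

In this toy setup the same subtask query resolves to a better-matching skill once one becomes available, which is the compatibility property the abstract describes; the actual subtask and skill representations used by SIL-C may differ.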

Problem

Research questions and friction points this paper is trying to address.

Maintaining compatibility between evolving skills and existing policies

Preventing disruption of reusable skill-based policies during skill updates

Enabling skill improvements without requiring policy retraining or adaptation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lazy learning interface ensures skill-policy compatibility

Bilateral mapping aligns subtask space with skill space

Skill selection based on trajectory distribution similarity
