In-Context Policy Adaptation via Cross-Domain Skill Diffusion

📅 2025-09-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
In long-horizon multi-task settings with scarce target-domain data and immutable models, this paper proposes an in-context policy adaptation framework for rapid transfer of skill-based reinforcement learning policies across domains. The method is trained entirely offline, requires zero parameter updates during inference, and integrates diffusion-model-driven skill learning with cross-domain consistency modeling. Key contributions include: (1) learning domain-agnostic prototype skill representations; (2) introducing a cross-domain skill diffusion mechanism to enhance skill generalization; and (3) designing a dynamic domain prompting strategy to improve context awareness. Evaluated on the Metaworld and CARLA benchmarks, the approach significantly outperforms existing zero-shot and few-shot adaptation methods, achieving an average 23.6% improvement in adaptation performance across diverse cross-domain configurations.

📝 Abstract
In this work, we present an in-context policy adaptation (ICPAD) framework designed for long-horizon multi-task environments, exploring diffusion-based skill learning techniques in cross-domain settings. The framework enables rapid adaptation of skill-based reinforcement learning policies to diverse target domains, especially under the stringent constraints of no model updates and only limited target-domain data. Specifically, the framework employs a cross-domain skill diffusion scheme, where domain-agnostic prototype skills and a domain-grounded skill adapter are learned jointly and effectively from an offline dataset through cross-domain consistent diffusion processes. The prototype skills act as primitives for common behavior representations of long-horizon policies, serving as a lingua franca to bridge different domains. Furthermore, to enhance the in-context adaptation performance, we develop a dynamic domain prompting scheme that guides the diffusion-based skill adapter toward better alignment with the target domain. Through experiments with robotic manipulation in Metaworld and autonomous driving in CARLA, we show that our ICPAD framework achieves superior policy adaptation performance under limited target domain data conditions for various cross-domain configurations including differences in environment dynamics, agent embodiment, and task horizon.
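To make the abstract's pipeline concrete, the sketch below illustrates the general shape of a prompt-conditioned diffusion skill adapter: a domain-agnostic prototype skill is noised under a standard DDPM forward process and then denoised conditioned on a domain prompt vector. This is a minimal toy illustration, not the paper's implementation; the class, the linear stand-in for the denoising network, and all dimensions are hypothetical.

```python
import numpy as np

class SkillDiffusionAdapter:
    """Toy sketch (hypothetical, not the paper's code) of a diffusion-based
    skill adapter: a prototype skill z0 is noised over T steps, then denoised
    while conditioning on a domain prompt c."""

    def __init__(self, skill_dim, prompt_dim, T=50, seed=0):
        rng = np.random.default_rng(seed)
        self.T = T
        # Linear noise schedule beta_t, as in standard DDPM.
        self.betas = np.linspace(1e-4, 0.02, T)
        self.alphas = 1.0 - self.betas
        self.alpha_bars = np.cumprod(self.alphas)
        # Stand-in "denoiser": a random linear map from [z_t, c, t] to noise.
        # A real adapter would be a trained neural network.
        self.W = rng.normal(0.0, 0.1, (skill_dim, skill_dim + prompt_dim + 1))

    def add_noise(self, z0, t, rng):
        # Forward process q(z_t | z_0) in closed form via cumulative alphas.
        eps = rng.normal(size=z0.shape)
        ab = self.alpha_bars[t]
        return np.sqrt(ab) * z0 + np.sqrt(1.0 - ab) * eps, eps

    def predict_noise(self, zt, t, c):
        # The domain prompt c conditions the denoiser: this is where a
        # "dynamic domain prompting" signal would enter.
        x = np.concatenate([zt, c, [t / self.T]])
        return self.W @ x

    def adapt(self, z0, c, rng):
        # Noise the prototype skill, then run the reverse (denoising) chain
        # conditioned on the target-domain prompt.
        zt, _ = self.add_noise(z0, self.T - 1, rng)
        for t in reversed(range(self.T)):
            eps_hat = self.predict_noise(zt, t, c)
            a, ab = self.alphas[t], self.alpha_bars[t]
            zt = (zt - (1.0 - a) / np.sqrt(1.0 - ab) * eps_hat) / np.sqrt(a)
            if t > 0:
                zt += np.sqrt(self.betas[t]) * rng.normal(size=zt.shape)
        return zt
```

With an untrained linear stand-in the output is meaningless noise; the sketch only shows where the prototype skill, the prompt conditioning, and the cross-domain denoising chain sit relative to each other.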
Problem

Research questions and friction points this paper is trying to address.

Adapting reinforcement learning policies across domains
Learning skills with limited target domain data
Enabling policy transfer without model updates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-domain skill diffusion for policy adaptation
Dynamic domain prompting to enhance alignment
Offline dataset learning without model updates