🤖 AI Summary
In long-horizon multi-task settings with scarce target-domain data and frozen models, this paper proposes an in-context policy adaptation framework for rapid cross-domain transfer of skill-based reinforcement learning policies. The method requires no parameter updates at inference time and integrates diffusion-model-driven skill learning with cross-domain consistency modeling. Key contributions include: (1) learning domain-agnostic prototype skill representations; (2) introducing a cross-domain skill diffusion mechanism to enhance skill generalization; and (3) designing a dynamic domain prompting strategy to improve context awareness. Evaluated on the Metaworld and CARLA benchmarks, the approach significantly outperforms existing zero-shot and few-shot adaptation methods, achieving an average 23.6% improvement in adaptation performance across diverse cross-domain configurations.
📝 Abstract
In this work, we present an in-context policy adaptation (ICPAD) framework designed for long-horizon multi-task environments, exploring diffusion-based skill learning techniques in cross-domain settings. The framework enables rapid adaptation of skill-based reinforcement learning policies to diverse target domains, even under the stringent constraints of no model updates and only limited target-domain data. Specifically, the framework employs a cross-domain skill diffusion scheme, in which domain-agnostic prototype skills and a domain-grounded skill adapter are learned jointly and effectively from an offline dataset through cross-domain consistent diffusion processes. The prototype skills act as primitives for common behavior representations of long-horizon policies, serving as a lingua franca to bridge different domains. Furthermore, to enhance in-context adaptation performance, we develop a dynamic domain prompting scheme that guides the diffusion-based skill adapter toward better alignment with the target domain. Through experiments with robotic manipulation in Metaworld and autonomous driving in CARLA, we show that our ICPAD framework achieves superior policy adaptation performance under limited target-domain data conditions for various cross-domain configurations, including differences in environment dynamics, agent embodiment, and task horizon.