🤖 AI Summary
This work addresses the limited generalization of existing robot skill models, which often fail to preserve the kinematic characteristics of demonstrated skills under environmental changes. The authors propose a novel skill modeling framework based on Gaussian Processes (GPs) and sparse waypoints, presenting the first systematic investigation into adaptive mechanisms for GP-based skill representations. By integrating trajectory optimization, behavioral cloning, and reinforcement learning, the approach enables large-scale adaptation of skills from a single demonstration while precisely preserving their kinematic profiles, including pose and its first- and second-order analytical derivatives. Extensive simulations and real-world experiments on drawer opening, box pushing, and rod manipulation tasks demonstrate significant improvements over baseline methods, achieving high success rates, strong kinematic fidelity, and low velocity errors.
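To make the core representation concrete, the sketch below fits a 1-D GP through sparse via-points with an RBF kernel and evaluates the posterior mean together with its analytic first and second time derivatives, which is the property that lets such a model preserve a skill's kinematic profile. This is a minimal illustration under assumed values (lengthscale, via-points, the `sin` demonstration), not the paper's implementation.

```python
import numpy as np

ELL, NOISE = 0.3, 1e-8  # assumed RBF lengthscale and jitter, not from the paper

def rbf(a, b, ell=ELL):
    """Squared-exponential kernel matrix between 1-D input vectors."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

# Hypothetical via-points (time, position) from a single demonstration.
t_vp = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_vp = np.sin(np.pi * t_vp)  # stand-in for demonstrated poses

K = rbf(t_vp, t_vp) + NOISE * np.eye(len(t_vp))
alpha = np.linalg.solve(K, y_vp)  # K^{-1} y, reused for every query

def posterior(t_q):
    """Posterior mean and its analytic 1st/2nd time derivatives."""
    d = t_q[:, None] - t_vp[None, :]
    k = np.exp(-0.5 * (d / ELL) ** 2)
    dk = -(d / ELL**2) * k                        # dk/dt
    d2k = ((d**2 / ELL**4) - 1.0 / ELL**2) * k    # d^2k/dt^2
    return k @ alpha, dk @ alpha, d2k @ alpha     # pose, velocity, acceleration

t_q = np.linspace(0.0, 1.0, 101)
pos, vel, acc = posterior(t_q)
```

Because the derivatives come from differentiating the kernel rather than from finite differences, adapting the via-points leaves velocity and acceleration available in closed form along the whole trajectory.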
📝 Abstract
General robot skill adaptation requires expressive representations that remain robust to varying task configurations. While recent learning-based skill adaptation methods, often refined via Reinforcement Learning (RL), have shown success, existing skill models typically lack the representational capacity to handle anything beyond minor environmental changes. In contrast, Gaussian Process (GP)-based skill modelling provides an expressive representation with useful analytical properties; however, adaptation of GP-based skills remains underexplored. This paper proposes a novel, robust skill adaptation framework that utilises GPs with sparse via-points for compact and expressive modelling. The model captures the trajectory's poses and leverages their first and second analytical derivatives to preserve the skill's kinematic profile. We present three adaptation methods to handle the variability between initial and observed configurations. First, an optimisation agent adjusts the path's via-points while preserving the demonstrated velocity. Second, a behaviour cloning agent is trained to replicate the output trajectories of the optimisation agent. Third, an RL agent learns to modify via-points online while maintaining the kinematic profile. Evaluated across three tasks (drawer opening, cube pushing and bar manipulation) in both simulation and on hardware, our proposed methods outperform every benchmark in success rate. Furthermore, the results demonstrate that the GP-based representation enables all three methods to attain high cosine similarity and low velocity-magnitude error, indicating strong preservation of the kinematic profile. Overall, our formulation provides a compact representation capable of adapting to large deviations from a single demonstrated skill.
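The optimisation-based adaptation idea in the abstract, shifting sparse via-points so the trajectory reaches a changed goal while the velocity profile stays close to the demonstration's, can be sketched as below. The spline parameterisation, cost weights, and goal value are illustrative assumptions; the paper's agent operates on the GP model, not this simplified stand-in.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.optimize import minimize

# Demonstrated via-points and the velocity profile they induce
# (a sin arc as a hypothetical single demonstration).
t_vp = np.linspace(0.0, 1.0, 5)
demo_vp = np.sin(np.pi * t_vp)
t_q = np.linspace(0.0, 1.0, 50)
demo_vel = CubicSpline(t_vp, demo_vp)(t_q, 1)  # first derivative of the spline

new_goal = 0.4  # changed task configuration: the endpoint has moved

def cost(vp):
    """Trade off reaching the new goal against velocity-profile deviation."""
    vel = CubicSpline(t_vp, vp)(t_q, 1)
    goal_err = (vp[-1] - new_goal) ** 2
    vel_err = np.mean((vel - demo_vel) ** 2)
    return 1000.0 * goal_err + vel_err  # weight chosen for illustration

res = minimize(cost, demo_vp, method="L-BFGS-B")
adapted_vp = res.x
```

The behaviour cloning agent described next would be trained on input/output pairs of such optimised trajectories, and the RL agent would perform comparable via-point updates online.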