🤖 AI Summary
Learning multiple subtask skills efficiently for complex robotic manipulation remains challenging.

Method: This paper proposes a segment-wise skill learning framework based on latent-space modeling. It formalizes human demonstrations as a latent-variable-driven skill segmentation process, reinterprets mixture density networks (MDNs) as a library of feedback controllers conditioned on latent states, and constructs a unified probabilistic graphical model that integrates skill segmentation with control-law inference. By connecting linear feedback control with behaviour cloning, the framework jointly optimizes skill identification and robust control in the latent space.

Results: Experiments demonstrate significant improvements in task success rate and in robustness to observation noise. The method is further validated on a physical robot, confirming deployment stability and cross-task generalization.
📝 Abstract
Manipulation tasks often consist of subtasks, each representing a distinct skill. Mastering these skills is essential for robots, as it enhances their autonomy, efficiency, adaptability, and ability to work in their environment. Learning from demonstrations allows robots to rapidly acquire new skills without starting from scratch, with demonstrations typically sequencing skills to achieve tasks. Behaviour cloning approaches to learning from demonstration commonly rely on mixture density network output heads to predict robot actions. In this work, we first reinterpret the mixture density network as a library of feedback controllers (or skills) conditioned on latent states. This arises from the observation that a one-layer linear network is functionally equivalent to a classical feedback controller, with network weights corresponding to controller gains. We use this insight to derive a probabilistic graphical model that combines these elements, describing the skill acquisition process as segmentation in a latent space, where each skill policy functions as a feedback control law in this latent space. Our approach significantly improves not only task success rate, but also robustness to observation noise when trained with human demonstrations. Our physical robot experiments further show that the induced robustness improves model deployment on robots.
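The key observation, that a one-layer linear network is functionally equivalent to a classical linear feedback controller with the network weights playing the role of controller gains, can be sketched numerically. This is a minimal illustration, not the paper's implementation: the dimensions, the gain matrix `K`, the set point `x_ref`, and the three-component MDN head are all made-up assumptions.

```python
import numpy as np

# A linear feedback controller  u = -K @ (x - x_ref) + u_ff  can be rewritten
# as an affine map  u = W @ x + b  with  W = -K  and  b = K @ x_ref + u_ff,
# i.e. exactly a one-layer linear network whose weights act as controller gains.
rng = np.random.default_rng(0)
state_dim, action_dim = 4, 2

K = rng.normal(size=(action_dim, state_dim))   # controller gains (assumed)
x_ref = rng.normal(size=state_dim)             # set point (assumed)
u_ff = rng.normal(size=action_dim)             # feed-forward term (assumed)

W = -K                                         # network weights
b = K @ x_ref + u_ff                           # network bias

x = rng.normal(size=state_dim)                 # current (latent) state
u_controller = -K @ (x - x_ref) + u_ff
u_network = W @ x + b
assert np.allclose(u_controller, u_network)

# An MDN output head with M components can then be read as a library of M such
# controllers: each component mean is an affine map of the state, and the
# mixture weights select which "skill" is currently responsible.
M = 3
Ws = rng.normal(size=(M, action_dim, state_dim))  # per-skill gains (assumed)
bs = rng.normal(size=(M, action_dim))             # per-skill biases (assumed)
logits = rng.normal(size=M)
pi = np.exp(logits) / np.exp(logits).sum()        # mixture weights (softmax)

means = Ws @ x + bs                               # each row: one controller's action
u_mix = pi @ means                                # expected action under the mixture
```

In this reading, segmentation corresponds to which mixture component dominates `pi` along a demonstration, and each component's affine mean is one feedback control law in the latent space.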