Pretrained Bayesian Non-parametric Knowledge Prior in Robotic Long-Horizon Reinforcement Learning

📅 2025-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the rigidity and insufficient diversity of skill priors in long-horizon robotic reinforcement learning, this paper proposes a Bayesian nonparametric framework for pretraining skill priors. Unlike conventional fixed-structure priors (e.g., single Gaussian), our approach employs a Dirichlet process mixture model coupled with a birth-merge heuristic to automatically infer the unknown number of primitive motor skills, enabling explicit skill localization and interpretable control. We further construct a skill embedding space that unifies reinforcement learning policy distillation and transfer learning. Experiments on complex long-horizon manipulation tasks demonstrate substantial improvements in task success rate and sample efficiency, alongside effective cross-task skill reuse. All code, datasets, and demonstration videos are publicly released.

📝 Abstract
Reinforcement learning (RL) methods typically learn new tasks from scratch, often disregarding prior knowledge that could accelerate the learning process. While some methods incorporate previously learned skills, they usually rely on a fixed structure, such as a single Gaussian distribution, to define skill priors. This rigid assumption can restrict the diversity and flexibility of skills, particularly in complex, long-horizon tasks. In this work, we introduce a method that models potential primitive skill motions as having non-parametric properties with an unknown number of underlying features. We utilize a Bayesian non-parametric model, specifically Dirichlet Process Mixtures, enhanced with birth and merge heuristics, to pre-train a skill prior that effectively captures the diverse nature of skills. Additionally, the learned skills are explicitly trackable within the prior space, enhancing interpretability and control. By integrating this flexible skill prior into an RL framework, our approach surpasses existing methods in long-horizon manipulation tasks, enabling more efficient skill transfer and task success in complex environments. Our findings show that a richer, non-parametric representation of skill priors significantly improves both the learning and execution of challenging robotic tasks. All data, code, and videos are available at https://ghiara.github.io/HELIOS/.
Problem

Research questions and friction points this paper addresses.

Incorporating diverse prior knowledge into reinforcement learning for skill transfer
Overcoming rigid skill prior assumptions in long-horizon robotic tasks
Enhancing interpretability and control of learned skills in complex environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian non-parametric model for skill priors
Dirichlet Process Mixtures with birth/merge heuristics
Trackable skills in prior space for interpretability
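The core innovation above, a Dirichlet Process Mixture that infers the number of primitive skills from data rather than fixing it in advance, can be illustrated with a small sketch. This is not the paper's implementation: it uses scikit-learn's truncated variational DP mixture (`BayesianGaussianMixture`) in place of the authors' birth/merge inference, and the 2-D "skill embeddings" and weight threshold are made up for illustration.

```python
# Illustrative sketch only: scikit-learn's variational DP mixture stands in
# for the paper's DPM with birth/merge heuristics; data are synthetic.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic "skill embeddings": three well-separated 2-D clusters,
# standing in for latent primitive motions of unknown count.
data = np.vstack([
    rng.normal(loc=c, scale=0.3, size=(100, 2))
    for c in ([0.0, 0.0], [5.0, 5.0], [-5.0, 5.0])
])

# Truncate the DP at 10 components; unused components are driven
# toward near-zero posterior weight during fitting.
dpmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.1,  # small alpha favors fewer active skills
    covariance_type="full",
    max_iter=500,
    random_state=0,
)
dpmm.fit(data)

# "Active" skills = components retaining non-negligible posterior weight.
active = int(np.sum(dpmm.weights_ > 0.05))
print(f"active components: {active} of {dpmm.n_components}")
```

Because each data point has a posterior responsibility over the mixture components, a learned skill stays localized to a specific component, which is the sense in which the prior makes skills trackable and interpretable.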