From planning to policy: distilling $\texttt{Skill-RRT}$ for long-horizon prehensile and non-prehensile manipulation

📅 2025-02-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Long-horizon, contact-rich manipulation tasks that interleave prehensile (grasping) and non-prehensile (non-grasping) skills are hard because of skill chaining, weak contact modeling, and low planning efficiency. Method: We propose Skill-RRT, an extension of RRT that adds skill applicability checks and intermediate object pose sampling, coupled with goal-conditioned connectors for robust skill switching. We then distill Skill-RRT into an efficient goal-conditioned policy by training connectors only on transitions selected via lazy planning and by replaying demonstrations under injected noise. Contribution/Results: Our method significantly outperforms MAPLE and the original Skill-RRT in simulation. Critically, the policy, trained exclusively in simulation, achieves zero-shot transfer to a real robot, attaining over 80% success rates across three complex manipulation tasks.

📝 Abstract
Current robots face challenges in manipulation tasks that require a long sequence of prehensile and non-prehensile skills. This involves handling contact-rich interactions and chaining multiple skills while considering their long-term consequences. This paper presents a framework that leverages imitation learning to distill a planning algorithm, capable of solving long-horizon problems but requiring extensive computation time, into a policy for efficient action inference. We introduce $\texttt{Skill-RRT}$, an extension of the rapidly-exploring random tree (RRT) that incorporates skill applicability checks and intermediate object pose sampling for efficient long-horizon planning. To enable skill chaining, we propose $\textit{connectors}$, goal-conditioned policies that transition between skills while minimizing object disturbance. Using lazy planning, connectors are selectively trained on relevant transitions, reducing the cost of training. High-quality demonstrations are generated with $\texttt{Skill-RRT}$ and refined by a noise-based replay mechanism to ensure robust policy performance. The distilled policy, trained entirely in simulation, transfers zero-shot to the real world and achieves over 80% success rates across three challenging manipulation tasks. In simulation, our approach outperforms the state-of-the-art skill-based reinforcement learning method, $\texttt{MAPLE}$, as well as $\texttt{Skill-RRT}$ itself.
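The planning loop the abstract describes can be sketched in a few lines. The toy below is a hypothetical simplification, not the paper's implementation: the object "pose" is a single scalar, the two placeholder skills (`slide`, `pick_place`) and their applicability thresholds are invented for illustration, and the learned connectors are omitted. It only illustrates the two additions named in the abstract, skill applicability checks before each tree extension and intermediate object pose sampling as the extension target.

```python
import random

class Skill:
    """A skill with an applicability check and a (deterministic) effect model."""
    def __init__(self, name, applicable, apply):
        self.name = name
        self.applicable = applicable  # pose -> bool: skill applicability check
        self.apply = apply            # (pose, target) -> resulting pose

# Hypothetical skills: "slide" makes bounded moves while the object is far
# from the goal region; "pick_place" can jump to the target once pose >= 8.
SKILLS = [
    Skill("slide",
          applicable=lambda p: p < 8.0,
          apply=lambda p, t: p + max(-1.0, min(1.0, t - p))),
    Skill("pick_place",
          applicable=lambda p: p >= 8.0,
          apply=lambda p, t: t),
]

def skill_rrt(start, goal, max_iters=2000, tol=1e-6, seed=0):
    """Grow a tree of (pose, parent_index, skill_name) nodes toward the goal."""
    rng = random.Random(seed)
    tree = [(start, None, None)]
    for _ in range(max_iters):
        # Intermediate object pose sampling, with a goal bias.
        target = goal if rng.random() < 0.2 else rng.uniform(0.0, 10.0)
        # Nearest node in the tree to the sampled pose.
        idx = min(range(len(tree)), key=lambda i: abs(tree[i][0] - target))
        pose = tree[idx][0]
        # Skill applicability check: only extend with skills valid at this pose.
        usable = [s for s in SKILLS if s.applicable(pose)]
        if not usable:
            continue
        skill = rng.choice(usable)
        tree.append((skill.apply(pose, target), idx, skill.name))
        if abs(tree[-1][0] - goal) < tol:
            # Backtrack from the goal node to recover the skill sequence.
            plan, i = [], len(tree) - 1
            while tree[i][1] is not None:
                plan.append(tree[i][2])
                i = tree[i][1]
            return list(reversed(plan))
    return None  # no plan found within the iteration budget
```

Running `skill_rrt(0.0, 10.0)` returns an ordered list of skill names, e.g. several `slide` extensions followed by a final `pick_place`; in the paper this sequence would additionally be stitched together by the goal-conditioned connectors before distillation.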
Problem

Research questions and friction points this paper is trying to address.

Efficient long-horizon manipulation tasks
Skill chaining with minimal object disturbance
Zero-shot transfer from simulation to reality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Imitation learning distills planning into policy
Skill-RRT enhances RRT for efficient planning
Connectors enable smooth, low-disturbance skill transitions
Haewon Jung
Korea Advanced Institute of Science and Technology (KAIST)
Donguk Lee
Korea Advanced Institute of Science and Technology (KAIST)
Haecheol Park
Korea Advanced Institute of Science and Technology (KAIST)
JunHyeop Kim
Korea Advanced Institute of Science and Technology (KAIST)
Beomjoon Kim
Korea Advanced Institute of Science and Technology (KAIST)
Machine Learning · Robotics · Artificial Intelligence