AI Summary
Addressing the challenges of high task complexity and scarce training data in long-horizon robotic manipulation, this paper proposes a hierarchical skill-policy framework. First, complex tasks are decomposed into reusable local skills, with motion planning ensuring coherent action sequences. Second, high-quality training data are automatically generated from only ten human demonstrations, followed by imitation learning. Third, online adaptation is integrated with PPO/SAC-based reinforcement learning for joint fine-tuning of the skill modules and motion-planning targets. This work introduces the first closed-loop paradigm unifying "task decomposition → automated data generation → imitation learning → reinforcement fine-tuning." Evaluated on RoboSuite under the highest reset-range setting, the approach achieves an 80% success rate on visuomotor control tasks. Ablation studies show that the joint fine-tuning stage improves average performance by 89%.
Abstract
Long-horizon manipulation has been a long-standing challenge in the robotics community. We propose ReinforceGen, a system that combines task decomposition, data generation, imitation learning, and motion planning to form an initial solution, and then improves each component through reinforcement-learning-based fine-tuning. ReinforceGen first segments the task into multiple localized skills, which are connected through motion planning. The skills and motion-planning targets are trained with imitation learning on a dataset generated from 10 human demonstrations, and then fine-tuned through online adaptation and reinforcement learning. When benchmarked on RoboSuite, ReinforceGen reaches an 80% success rate on all tasks with visuomotor control in the highest reset-range setting. Additional ablation studies show that our fine-tuning approach contributes an 89% average performance increase. More results and videos are available at https://reinforcegen.github.io/
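The pipeline described in the abstract (localized skills connected by motion-planning targets, then fine-tuning of those targets) can be sketched in miniature. Everything below is illustrative: `Skill`, `rollout`, and `fine_tune_targets` are hypothetical names, the state is a single scalar, and a finite-difference update stands in for the paper's PPO/SAC-based fine-tuning.

```python
# Toy sketch of a skills-plus-motion-planning pipeline with target fine-tuning.
# All names and signatures are illustrative, not ReinforceGen's actual API.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Skill:
    name: str
    target: float                       # motion-planning target (1-D here)
    policy: Callable[[float], float]    # local skill: state -> next state


def rollout(skills: List[Skill], state: float) -> float:
    """Execute skills in sequence; a motion-planning transit moves the
    state to each skill's target before the local policy runs."""
    for sk in skills:
        state = sk.target        # stand-in for motion planning to the target
        state = sk.policy(state) # local skill execution
    return state


def fine_tune_targets(skills: List[Skill],
                      reward_fn: Callable[[float], float],
                      lr: float = 0.1, steps: int = 50,
                      eps: float = 1e-3) -> List[Skill]:
    """Toy 'online adaptation': nudge each motion target along a
    finite-difference reward gradient (proxy for RL fine-tuning)."""
    for _ in range(steps):
        for sk in skills:
            base = reward_fn(rollout(skills, 0.0))
            sk.target += eps
            up = reward_fn(rollout(skills, 0.0))
            sk.target -= eps
            sk.target += lr * (up - base) / eps
    return skills


# Usage: two skills; the imitation-learned "place" target is off by 0.8.
skills = [Skill("reach", 0.3, lambda s: s),
          Skill("place", 0.0, lambda s: s + 0.2)]
reward = lambda final: -(final - 1.0) ** 2   # goal state is 1.0
before = reward(rollout(skills, 0.0))
fine_tune_targets(skills, reward)
after = reward(rollout(skills, 0.0))          # after > before
```

The design point this illustrates is the one the abstract makes: the initial imitation-learned solution is only a starting point, and the fine-tuning stage adjusts the connecting motion targets against task reward rather than retraining the skills from scratch.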