🤖 AI Summary
Bridging the sim-to-real gap for low-cost quadrupedal robots performing long-horizon manipulation tasks (search, locomotion, grasping, placing) in complex real-world environments remains challenging.
Method: We propose a hierarchical reinforcement learning framework trained exclusively in simulation and driven by a single wrist-mounted RGB camera. It integrates visual perception and motor control via a novel teacher-student policy distillation mechanism and a progressive policy expansion architecture. To mitigate reality gaps, we systematically incorporate domain randomization, dynamics perturbations, and realistic observation noise modeling.
Results: Evaluated across diverse indoor/outdoor scenes and lighting conditions, our approach achieves an 82.3% average success rate on long-horizon tasks using real hardware—significantly outperforming existing baselines. To our knowledge, this is the first end-to-end sim-to-real deployment enabling full manipulation on low-cost quadrupeds using monocular vision alone.
📝 Abstract
We present a low-cost quadruped manipulation system that solves long-horizon real-world tasks, trained by reinforcement learning purely in simulation. The system comprises 1) a hierarchical design of a high-level policy for visual-mobile manipulation following instructions, and a low-level policy for quadruped movement and limb-control, 2) a progressive policy expansion approach for solving the long-horizon task together with a teacher-student framework for efficient high-level training of the high-level visuomotor policy, and 3) a suite of techniques for minimizing sim-to-real gaps. With budget-friendly but limited reliability and performance hardware, and just one wrist-mounted RGB camera, the entire system fully trained in simulation achieves high success rates for long horizon tasks involving search, move, grasp, and drop-into, with fluid sim-to-real transfer in a wide variety of indoor and outdoor scenes and lighting conditions.Extensive real-world evaluations show that on the long horizon mobile manipulation tasks, our system achieves good performance when transferred to real both in terms of task success rate and execution efficiency. Finally, we discuss the necessity of our sim-to-real techniques for legged mobile manipulation, and show their ablation performance.