🤖 AI Summary
This work addresses the challenge of deploying quadrupedal mobile manipulators to perform multi-stage pick-and-place tasks (search → approach → grasp → transport → place) in partially observable, long-horizon, real-world environments spanning indoor and outdoor settings. We propose an end-to-end visuo-motor policy framework trained entirely in simulation. Our method integrates hierarchical task decomposition, domain-randomized reinforcement learning, and self-supervised sim-to-real transfer. Key contributions include: (i) the first emergence of robust behaviors—such as re-grasping and task chaining—in quadrupedal mobile manipulation; (ii) strong generalization to complex, unseen environments; and (iii) zero-shot deployment without real-world fine-tuning. Experiments demonstrate a ≈80% task success rate in real-world settings, and ablation studies confirm the efficacy of each technical component.
📝 Abstract
Quadruped-based mobile manipulation presents significant challenges in robotics due to the diversity of required skills, the extended task horizon, and partial observability. After presenting a multi-stage pick-and-place task as a succinct yet sufficiently rich setup that captures key desiderata for quadruped-based mobile manipulation, we propose an approach that trains a visuo-motor policy entirely in simulation and achieves nearly 80% success in the real world. The policy efficiently performs search, approach, grasp, transport, and drop-off actions, with emergent behaviors such as re-grasping and task chaining. We conduct an extensive set of real-world experiments with ablation studies highlighting key techniques for efficient training and effective sim-to-real transfer. Additional experiments demonstrate deployment across a variety of indoor and outdoor environments. Demo videos and additional resources are available on the project page: https://horizonrobotics.github.io/gail/SLIM.