🤖 AI Summary
To address the limitations of general motion tracking (GMT) policies in humanoid whole-body loco-manipulation, specifically their lack of object awareness and insufficient manipulation precision, this paper proposes ResMimic, a two-stage residual learning framework. First, a GMT policy trained on large-scale human-only motion data serves as a task-agnostic base that generates human-like whole-body movements; second, a residual policy refines the GMT outputs to improve locomotion and incorporate object interaction. To make training more efficient, the authors design a point-cloud-based object tracking reward for smoother optimization, a contact reward that encourages accurate body-object interactions, and a curriculum-based virtual object controller that stabilizes early training. The framework is evaluated in simulation and on a real Unitree G1 humanoid. Results show substantial gains over strong baselines in task success, training efficiency, and robustness.
📝 Abstract
Humanoid whole-body loco-manipulation promises transformative capabilities for daily service and warehouse tasks. While recent advances in general motion tracking (GMT) have enabled humanoids to reproduce diverse human motions, these policies lack the precision and object awareness required for loco-manipulation. To this end, we introduce ResMimic, a two-stage residual learning framework for precise and expressive humanoid control from human motion data. First, a GMT policy, trained on large-scale human-only motion, serves as a task-agnostic base for generating human-like whole-body movements. An efficient but precise residual policy is then learned to refine the GMT outputs to improve locomotion and incorporate object interaction. To further facilitate efficient training, we design (i) a point-cloud-based object tracking reward for smoother optimization, (ii) a contact reward that encourages accurate humanoid body-object interactions, and (iii) a curriculum-based virtual object controller to stabilize early training. We evaluate ResMimic in both simulation and on a real Unitree G1 humanoid. Results show substantial gains in task success, training efficiency, and robustness over strong baselines. Videos are available at https://resmimic.github.io/ .
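The two core ideas in the abstract, a residual correction on top of the GMT output and a point-cloud-based object tracking reward, can be illustrated with a minimal sketch. Everything below is an assumption for illustration and not the authors' implementation: the abstract does not give the exact formulas, so the additive composition, the exponential reward kernel, and the names `compose_action`, `point_cloud_tracking_reward`, `residual_scale`, and `sigma` are hypothetical.

```python
import numpy as np

def compose_action(base_action, residual_action, residual_scale=1.0):
    """Assumed residual composition: GMT base action plus a scaled correction."""
    return np.asarray(base_action, dtype=float) + residual_scale * np.asarray(residual_action, dtype=float)

def transform_points(points_local, rotation, translation):
    """Map object-frame points (N, 3) into the world frame.

    rotation is a 3x3 matrix, translation is a 3-vector.
    """
    return points_local @ rotation.T + translation

def point_cloud_tracking_reward(points_local, rot_sim, pos_sim, rot_ref, pos_ref, sigma=0.1):
    """Hypothetical reward: compare the object's surface points under the
    simulated pose and the reference pose, so position and orientation
    errors are folded into a single distance term.
    """
    sim_pts = transform_points(points_local, rot_sim, pos_sim)
    ref_pts = transform_points(points_local, rot_ref, pos_ref)
    mean_dist = np.linalg.norm(sim_pts - ref_pts, axis=1).mean()
    # Exponential kernel is an assumed shaping choice; reward is 1 at zero error.
    return float(np.exp(-mean_dist / sigma))

# Usage: a unit-cube corner cloud, with the simulated object offset 0.1 m along x.
cube = np.array([[x, y, z] for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)])
identity = np.eye(3)
reward_perfect = point_cloud_tracking_reward(cube, identity, np.zeros(3), identity, np.zeros(3))
reward_offset = point_cloud_tracking_reward(cube, identity, np.array([0.1, 0.0, 0.0]), identity, np.zeros(3))
action = compose_action([1.0, 2.0], [0.5, -0.5], residual_scale=0.5)
```

One plausible reason a point-cloud comparison optimizes more smoothly than separate position and rotation terms is that it needs no hand-tuned weighting between the two error types; the abstract only states that the reward is designed "for smoother optimization", so this reading is an interpretation.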