AI Summary
To address motion instability and inefficient training arising from kinematic heterogeneity between humans and humanoid robots in imitation learning, this paper proposes a physically feasible whole-body imitation framework that eliminates the need for motion retargeting. Methodologically: (1) we introduce a novel vector-quantized periodic autoencoder to learn compact, universal atomic motion representations; (2) we design a privileged teacher-student decoupled reward distillation mechanism to bridge human and robot action spaces; and (3) we integrate self-supervised motion adaptation with whole-body pose tracking control. Extensive evaluations in simulation and on real robotic platforms demonstrate substantial improvements in motion stability, cross-task generalization, and training convergence speed. The framework successfully reproduces complex, previously unseen motions and consistently outperforms existing state-of-the-art methods.
Abstract
This paper presents a novel framework that enables real-world humanoid robots to maintain stability while performing human-like motion. Current methods use reinforcement learning on large amounts of retargeted human data to train a policy that lets humanoid robots track human body motion. However, due to the heterogeneity between human and humanoid robot motion, directly using retargeted human motion reduces training efficiency and stability. To this end, we introduce SMAP, a novel whole-body tracking framework that bridges the gap between human and humanoid action spaces, enabling accurate motion mimicry by humanoid robots. The core idea is to use a vector-quantized periodic autoencoder to capture generic atomic behaviors and adapt human motion into physically plausible humanoid motion. This adaptation accelerates training convergence and improves stability when handling novel or challenging motions. We then employ a privileged teacher to distill precise mimicry skills into the student policy with a proposed decoupled reward. We conduct experiments in simulation and in the real world to demonstrate the superior stability and performance of SMAP over SOTA methods, offering practical guidelines for advancing whole-body control in humanoid robots.
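To make the "vector-quantized" part of the autoencoder concrete, the sketch below shows only the generic nearest-neighbor codebook lookup that vector quantization typically relies on. It is not SMAP's implementation: the function name `quantize`, the toy codebook, and the 2-D latents are all hypothetical, and the periodic encoding, training losses, and codebook updates are omitted.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (Euclidean distance)."""
    # Broadcast to pairwise differences, shape (num_latents, num_codes, dim).
    diffs = latents[:, None, :] - codebook[None, :, :]
    # Squared distances, shape (num_latents, num_codes).
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Index of the closest code for each latent vector.
    indices = np.argmin(sq_dists, axis=1)
    return indices, codebook[indices]

# Hypothetical toy values: three "atomic behavior" codes in a 2-D latent space.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
latents = np.array([[0.9, 1.2], [-0.1, 0.2], [-0.8, 0.7]])

indices, quantized = quantize(latents, codebook)
print(indices)    # index of the selected code for each latent
print(quantized)  # the discrete codes that replace the continuous latents
```

In a setup like this, each continuous motion latent is replaced by a discrete code, which is one plausible reading of how a codebook could represent reusable "atomic behaviors"; how SMAP actually builds and trains its codebook is not specified in the abstract.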