🤖 AI Summary
This work addresses the challenge of deploying general-purpose humanoid motion trackers from simulation to real hardware, where interface- and dynamics-induced errors often make sustained teleoperation brittle. The authors present MOSAIC, an open-source, full-stack system that trains a teleoperation-oriented general motion tracker via reinforcement learning and then applies rapid residual adaptation: an interface-specific policy is trained on minimal interface-specific data and distilled into the general tracker through an additive residual module, bridging the sim-to-real interface gap without sacrificing generality. The authors report that this approach outperforms naive fine-tuning and continual learning, and they validate it with systematic ablations, out-of-distribution benchmarking, and real-robot experiments covering offline motion replay and online long-horizon teleoperation.
📝 Abstract
Generalist humanoid motion trackers have recently achieved strong simulation metrics by scaling data and training, yet often remain brittle on hardware during sustained teleoperation due to interface- and dynamics-induced errors. We present MOSAIC, an open-source, full-stack system for humanoid motion tracking and whole-body teleoperation across multiple interfaces. MOSAIC first learns a teleoperation-oriented general motion tracker via RL on a multi-source motion bank with adaptive resampling and rewards that emphasize world-frame motion consistency, which is critical for mobile teleoperation. To bridge the sim-to-real interface gap without sacrificing generality, MOSAIC then performs rapid residual adaptation: an interface-specific policy is trained using minimal interface-specific data, and then distilled into the general tracker through an additive residual module, outperforming naive fine-tuning or continual learning. We validate MOSAIC with systematic ablations, out-of-distribution benchmarking, and real-robot experiments demonstrating robust offline motion replay and online long-horizon teleoperation under realistic latency and noise. Project page: baai-humanoid.github.io/MOSAIC.
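The additive residual idea in the abstract can be illustrated with a toy sketch. This is not MOSAIC's implementation: the abstract does not specify the network architectures, the distillation loss, or the training procedure, so everything below is an assumption made for illustration. The linear "policies", the `Residual` class, the squared-error distillation objective, and all function names are hypothetical. The sketch only shows the general pattern: keep the general tracker frozen, start the residual at zero so the adapted policy initially matches the general one, and fit the residual so that general output plus residual matches an interface-specific teacher.

```python
import random


def general_policy(obs):
    """Frozen stand-in for the general tracker (hypothetical linear map)."""
    return [0.5 * obs[0] - 0.2 * obs[1], 0.1 * obs[0] + 0.3 * obs[1]]


def teacher_policy(obs):
    """Stand-in for the interface-specific policy (hypothetical)."""
    base = general_policy(obs)
    return [base[0] + 0.05 * obs[1], base[1] - 0.04 * obs[0]]


class Residual:
    """Linear residual module; zero-initialized so it starts as a no-op."""

    def __init__(self, obs_dim, act_dim):
        self.w = [[0.0] * obs_dim for _ in range(act_dim)]

    def __call__(self, obs):
        return [sum(wi * o for wi, o in zip(row, obs)) for row in self.w]


def adapted_policy(residual, obs):
    """Additive fusion: general action plus residual correction."""
    return [g + r for g, r in zip(general_policy(obs), residual(obs))]


def distill(residual, observations, lr=0.1, epochs=200):
    """Fit only the residual by SGD on squared error against the teacher."""
    for _ in range(epochs):
        for obs in observations:
            target = teacher_policy(obs)
            pred = adapted_policy(residual, obs)
            for i, row in enumerate(residual.w):
                err = pred[i] - target[i]
                for j in range(len(row)):
                    row[j] -= lr * err * obs[j]
    return residual


def mse(residual, observations):
    """Mean squared action error between adapted policy and teacher."""
    total = 0.0
    for obs in observations:
        pred = adapted_policy(residual, obs)
        target = teacher_policy(obs)
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
    return total / len(observations)


rng = random.Random(0)
data = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(32)]

residual = Residual(obs_dim=2, act_dim=2)
error_before = mse(residual, data)
distill(residual, data)
error_after = mse(residual, data)
```

Because `general_policy` is never modified, the behavior learned by the general tracker is preserved by construction, and the interface-specific correction lives entirely in the small residual. In this toy setting the teacher differs from the general policy by a linear term, so a linear residual can represent the correction exactly; a real system would presumably use neural networks and a richer objective.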