MOSAIC: Bridging the Sim-to-Real Gap in Generalist Humanoid Motion Tracking and Teleoperation with Rapid Residual Adaptation

📅 2026-02-09
🤖 AI Summary
This work addresses the challenge of deploying generalist humanoid motion trackers from simulation to real hardware, where interface- and dynamics-induced errors often undermine robustness during sustained teleoperation. The authors present MOSAIC, an open-source, full-stack system that first trains a general motion tracker via reinforcement learning, then performs rapid residual adaptation: an interface-specific policy is trained on minimal interface-specific data and distilled into the general tracker through an additive residual module, bridging the sim-to-real interface gap without sacrificing generality. The abstract reports that this approach outperforms naive fine-tuning and continual learning. The system is validated through systematic ablations, out-of-distribution benchmarking, and real-robot experiments covering offline motion replay and online long-horizon teleoperation.

📝 Abstract
Generalist humanoid motion trackers have recently achieved strong simulation metrics by scaling data and training, yet often remain brittle on hardware during sustained teleoperation due to interface- and dynamics-induced errors. We present MOSAIC, an open-source, full-stack system for humanoid motion tracking and whole-body teleoperation across multiple interfaces. MOSAIC first learns a teleoperation-oriented general motion tracker via RL on a multi-source motion bank with adaptive resampling and rewards that emphasize world-frame motion consistency, which is critical for mobile teleoperation. To bridge the sim-to-real interface gap without sacrificing generality, MOSAIC then performs rapid residual adaptation: an interface-specific policy is trained using minimal interface-specific data, and then distilled into the general tracker through an additive residual module, outperforming naive fine-tuning or continual learning. We validate MOSAIC with systematic ablations, out-of-distribution benchmarking, and real-robot experiments demonstrating robust offline motion replay and online long-horizon teleoperation under realistic latency and noise. Project page: baai-humanoid.github.io/MOSAIC.
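
The additive residual idea in the abstract can be illustrated with a toy sketch. This is not the MOSAIC implementation: the linear policies, the dimensions, the ridge-regression fit, and every name below (`general_policy`, `teacher_policy`, `fit_residual`, `adapted_policy`) are assumptions made for illustration only. The sketch keeps a general policy frozen, fits a small residual so that general output plus residual matches an interface-specific teacher on a small batch of observations, and then adds the residual at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, ACT_DIM = 8, 4  # hypothetical observation/action sizes

# Stand-in for the frozen general tracker (assumption: a fixed linear map).
W_general = rng.normal(size=(OBS_DIM, ACT_DIM))

def general_policy(obs):
    return obs @ W_general

# Stand-in for the interface-specific teacher policy (assumption: a
# slightly perturbed copy of the general policy).
W_teacher = W_general + 0.1 * rng.normal(size=(OBS_DIM, ACT_DIM))

def teacher_policy(obs):
    return obs @ W_teacher

def fit_residual(obs, ridge=1e-3):
    """Distill the teacher into a linear residual via ridge regression.

    Solves min_W ||obs @ W - (teacher - general)||^2 + ridge * ||W||^2.
    The general policy's weights are never modified.
    """
    target = teacher_policy(obs) - general_policy(obs)
    gram = obs.T @ obs + ridge * np.eye(obs.shape[1])
    return np.linalg.solve(gram, obs.T @ target)

def adapted_policy(obs, W_res):
    # Additive fusion: frozen general action plus learned residual.
    return general_policy(obs) + obs @ W_res

# A small batch stands in for "minimal interface-specific data".
obs_small = rng.normal(size=(64, OBS_DIM))
W_res = fit_residual(obs_small)

# Compare the action error against the teacher on held-out observations.
obs_test = rng.normal(size=(256, OBS_DIM))
err_before = np.mean((general_policy(obs_test) - teacher_policy(obs_test)) ** 2)
err_after = np.mean((adapted_policy(obs_test, W_res) - teacher_policy(obs_test)) ** 2)
print(f"error before: {err_before:.6f}, after: {err_after:.6f}")
```

Because the residual is purely additive, setting it to zero recovers the general policy exactly, which is one way to read the claim that adaptation does not sacrifice generality. A real system would presumably use neural networks and RL-trained policies rather than linear maps.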
Problem

Research questions and friction points this paper is trying to address.

sim-to-real gap
humanoid motion tracking
teleoperation
interface-induced errors
dynamics mismatch
Innovation

Methods, ideas, or system contributions that make the work stand out.

rapid residual adaptation
sim-to-real transfer
generalist humanoid motion tracking
whole-body teleoperation
additive residual module
Zhenguo Sun
Beijing Academy of Artificial Intelligence, 100084 Beijing, China.
Bo-Sheng Huang
Beijing Academy of Artificial Intelligence, 100084 Beijing, China.
Yibo Peng
Carnegie Mellon University
Code Generation, Multimodal NLP, AI Agents
Xukun Li
Kansas State University
Computer Vision, Machine Learning, Deep Learning, Statistical Modeling
Jingyu Ma
Beijing Academy of Artificial Intelligence, 100084 Beijing, China.
Yu Sun
Beijing Academy of Artificial Intelligence, 100084 Beijing, China.
Zhe Li
Beijing Academy of Artificial Intelligence, 100084 Beijing, China.
Haojun Jiang
Tsinghua University, 100084 Beijing, China.
Biao Gao
Institute of Information Engineering, Chinese Academy of Sciences
Information Security, Data Security
Zhenshan Bing
Nanjing University / Technical University of Munich
Robotics
Xinlong Wang
Beijing Academy of Artificial Intelligence
Computer Vision, Foundation Models
Alois Knoll
Technische Universität München
Robotics, AI, Sensor Data Fusion, Autonomous Driving, Cyber Physical Systems