🤖 AI Summary
Existing trajectory world models struggle to scale to a large number of distinct robot dynamics and often ignore prior knowledge of physical structure, which limits their zero-shot generalization. This work proposes WestWorld, a trajectory world model that addresses both issues with a system-aware Mixture-of-Experts (Sys-MoE), which routes among specialized experts via a learnable system embedding, and a structural embedding that aligns trajectory representations with robot morphology. Pretrained on 89 environments spanning diverse morphologies in simulation and the real world, WestWorld improves zero- and few-shot trajectory prediction over competitive baselines. It also improves downstream model-based control and has been deployed on a real Unitree Go1 quadruped robot, where it achieves stable locomotion.
📝 Abstract
Trajectory world models play a crucial role in robotic dynamics learning, planning, and control. While recent works have explored trajectory world models for diverse robotic systems, they struggle to scale to a large number of distinct system dynamics and overlook domain knowledge of physical structures. To address these limitations, we introduce WestWorld, a knoWledge-Encoded Scalable Trajectory World model for diverse robotic systems. To tackle the scalability challenge, we propose a novel system-aware Mixture-of-Experts (Sys-MoE) that dynamically combines and routes specialized experts for different robotic systems via a learnable system embedding. To further enhance zero-shot generalization, we incorporate domain knowledge of robot physical structures by introducing a structural embedding that aligns trajectory representations with morphological information. After pretraining on 89 complex environments spanning diverse morphologies across both simulation and real-world settings, WestWorld achieves significant improvements over competitive baselines in zero- and few-shot trajectory prediction. Additionally, it shows strong scalability across a wide range of robotic environments and significantly improves performance on downstream model-based control for different robots. Finally, we deploy our model on a real-world Unitree Go1, where it demonstrates stable locomotion performance (see our demo on the website: https://westworldrobot.github.io/). The code will be available upon publication.
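To make the routing idea concrete, here is a minimal NumPy sketch of a system-aware Mixture-of-Experts layer in which the gate is conditioned on a per-system embedding. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the gating function, the expert architecture, or the dimensions. The class name `SysMoESketch`, the top-k softmax gate, the linear experts, and all sizes below are hypothetical choices.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - x.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


class SysMoESketch:
    """Hypothetical system-aware MoE layer (not the paper's architecture).

    Assumption: the gate sees only a per-system embedding, so every token
    from the same robot system shares one mixture over the experts.
    """

    def __init__(self, num_systems, num_experts, d_model, d_sys, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.top_k = top_k
        # One embedding row per robot system (learned in a real model).
        self.system_embedding = rng.normal(0.0, 0.1, (num_systems, d_sys))
        # Gate maps a system embedding to one logit per expert.
        self.gate_weights = rng.normal(0.0, 0.1, (d_sys, num_experts))
        # Assumption: each expert is a single linear map d_model -> d_model.
        self.expert_weights = rng.normal(0.0, 0.1, (num_experts, d_model, d_model))

    def routing_weights(self, system_id):
        """Return a sparse mixture over experts for one system."""
        logits = self.system_embedding[system_id] @ self.gate_weights
        top = np.argsort(logits)[-self.top_k:]  # indices of the top-k experts
        weights = np.zeros_like(logits)
        weights[top] = softmax(logits[top])  # renormalize over the selected experts
        return weights

    def __call__(self, tokens, system_id):
        """tokens: (seq_len, d_model) trajectory tokens from one system."""
        weights = self.routing_weights(system_id)
        # Apply every expert, then combine with the system-level mixture.
        # (A practical implementation would skip experts with zero weight.)
        expert_out = np.einsum("td,edh->eth", tokens, self.expert_weights)
        return np.einsum("e,eth->th", weights, expert_out)


layer = SysMoESketch(num_systems=3, num_experts=4, d_model=8, d_sys=5)
tokens = np.ones((6, 8))  # placeholder trajectory tokens
output = layer(tokens, system_id=1)
mixture = layer.routing_weights(1)
```

In this sketch, adding a new robot system only requires a new row in `system_embedding`, while the experts are shared across systems. That is one plausible reading of how a learnable system embedding could help a single model cover many distinct dynamics. The structural embedding described in the abstract is not modeled here.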