Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning

📅 2024-11-21
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address weak cross-domain generalization and the lack of explicit visual-action alignment in trajectory prediction, this paper proposes Tra-MoE. Methodologically, it introduces: (i) a Top-1 sparsely-gated Mixture-of-Experts (MoE) architecture that balances domain-specific parameter specialization with cross-domain cooperation; (ii) an adaptive policy conditioning mechanism that learns 2D mask representations of predicted trajectories and explicitly aligns them with image observations to guide action prediction, presented as the first of its kind; and (iii) joint cross-domain multi-task pretraining with vision-trajectory co-modeling. Evaluated on both simulation and real-robot tasks, Tra-MoE outperforms dense baselines at a comparable parameter count while keeping FLOPs per token constant. It also shows improved generalization and scalability, supporting instruction-driven, fine-grained control for robotic policy learning.

๐Ÿ“ Abstract
Learning from multiple domains is a primary factor that influences the generalization of a single unified robot system. In this paper, we aim to learn the trajectory prediction model by using broad out-of-domain data to improve its performance and generalization ability. The trajectory model is designed to predict any-point trajectories in the current frame given an instruction and can provide detailed control guidance for robotic policy learning. To handle the diverse out-of-domain data distribution, we propose a sparsely-gated MoE (Top-1 gating strategy) architecture for the trajectory model, coined as Tra-MoE. The sparse activation design enables a good balance between parameter cooperation and specialization, effectively benefiting from large-scale out-of-domain data while maintaining constant FLOPs per token. In addition, we further introduce an adaptive policy conditioning technique by learning 2D mask representations for predicted trajectories, which are explicitly aligned with image observations to guide action prediction more flexibly. We perform extensive experiments in both simulation and real-world scenarios to verify the effectiveness of Tra-MoE and the adaptive policy conditioning technique. We also conduct a comprehensive empirical study on training Tra-MoE, demonstrating that Tra-MoE consistently exhibits superior performance compared to the dense baseline model, even when the latter is scaled to match Tra-MoE's parameter count.
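
The Top-1 routing mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `Top1MoE`, the helper functions, the purely linear experts, the random initialization, and the choice to scale the output by the gate probability are all assumptions made for the example. It is meant to show why per-token compute stays constant: each token runs through exactly one expert, however many experts the layer holds.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rng, rows, cols):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class Top1MoE:
    """Toy Top-1 sparsely-gated layer (hypothetical, for illustration only)."""

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)
        # Router: one row of weights per expert, producing one logit each.
        self.router = random_matrix(rng, num_experts, dim)
        # Experts: plain linear maps here (an assumed simplification).
        self.experts = [random_matrix(rng, dim, dim) for _ in range(num_experts)]

    def forward(self, token):
        probs = softmax(matvec(self.router, token))
        # Top-1 gating: pick the single expert with the highest gate probability.
        idx = max(range(len(probs)), key=probs.__getitem__)
        # Only the chosen expert is evaluated for this token.
        out = matvec(self.experts[idx], token)
        # Scale by the gate probability (an assumed, commonly used choice).
        return [probs[idx] * y for y in out], idx


layer = Top1MoE(dim=4, num_experts=3)
output, chosen = layer.forward([0.5, -1.0, 0.25, 2.0])
```

Adding experts to this sketch grows the parameter count and the router's output size, but the expert computation per token remains a single `dim × dim` product.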
Problem

Research questions and friction points this paper is trying to address.

Improve trajectory prediction using multi-domain data
Balance parameter cooperation and specialization with MoE
Enhance policy learning via adaptive trajectory conditioning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparsely-gated MoE architecture for trajectory prediction
Top-1 gating strategy for diverse data distribution
Adaptive policy conditioning with 2D mask representations
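
To make the idea of a 2D mask that is spatially aligned with the image more concrete, here is a small hand-written sketch. It is not the paper's method: Tra-MoE learns its mask representations, whereas this example simply rasterizes point tracks onto the image grid. The function name `trajectories_to_mask` and the binary, nearest-pixel encoding are assumptions chosen for illustration.

```python
def trajectories_to_mask(trajectories, height, width):
    """Rasterize 2D point tracks into a binary mask with `height` rows and `width` columns.

    trajectories: list of tracks, each a list of (x, y) pixel coordinates.
    Points that fall outside the image are ignored.
    """
    mask = [[0] * width for _ in range(height)]
    for track in trajectories:
        for x, y in track:
            col, row = int(round(x)), int(round(y))
            if 0 <= row < height and 0 <= col < width:
                mask[row][col] = 1
    return mask


tracks = [
    [(0, 0), (1, 1), (2, 2)],  # a diagonal track
    [(3, 0), (3, 1), (9, 9)],  # the last point lies outside a 4x4 image
]
mask = trajectories_to_mask(tracks, height=4, width=4)
```

Because the mask shares the image's height and width, it could be stacked with the observation as an extra channel before being passed to a policy network, which is one plausible way to condition on trajectories in image space.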
Authors

Jiange Yang
Nanjing University
Deep Learning, Computer Vision, Robotics, Embodied AI

Haoyi Zhu
Shanghai AI Lab | USTC | SJTU
World Model, Spatial Intelligence, Robot Learning, Embodied AI

Yating Wang
Shanghai Artificial Intelligence Laboratory, Tongji University

Gangshan Wu
Nanjing University, Shanghai Artificial Intelligence Laboratory

Tong He
Shanghai Artificial Intelligence Laboratory

Limin Wang
Nanjing University, Shanghai Artificial Intelligence Laboratory