Empowering Multi-Robot Cooperation via Sequential World Models

📅 2025-09-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses two key challenges in multi-robot coordination: the difficulty of modeling joint dynamics and the high cost of communication. To this end, we propose the Sequential World Model (SeqWM) framework, which decomposes joint dynamics into independent, sequentially structured agent-wise world models and introduces a sequential communication mechanism that enables explicit intention sharing with linear communication complexity. Technically, SeqWM combines latent rollouts, agent-wise world model learning, and model-based multi-agent reinforcement learning. Evaluated on the Bi-DexHands and Multi-Quad simulation benchmarks, SeqWM outperforms state-of-the-art model-free and model-based baselines in both overall performance and sample efficiency, and exhibits cooperative behaviors such as predictive adaptation and role division. It has also been deployed on physical quadruped robots, demonstrating its effectiveness in real-world multi-robot systems.

📝 Abstract
Model-based reinforcement learning (MBRL) has shown significant potential in robotics due to its high sample efficiency and planning capability. However, extending MBRL to multi-robot cooperation remains challenging due to the complexity of joint dynamics. To address this, we propose the Sequential World Model (SeqWM), a novel framework that integrates the sequential paradigm into model-based multi-agent reinforcement learning. SeqWM employs independent, sequentially structured agent-wise world models to decompose complex joint dynamics. Latent rollouts and decision-making are performed through sequential communication, where each agent generates its future trajectory and plans its actions based on the predictions of its predecessors. This design enables explicit intention sharing, which enhances cooperative performance, and reduces communication overhead to linear complexity. Results in challenging simulated environments (Bi-DexHands and Multi-Quad) show that SeqWM outperforms existing state-of-the-art model-free and model-based baselines in both overall performance and sample efficiency, while exhibiting advanced cooperative behaviors such as predictive adaptation and role division. Furthermore, SeqWM has been successfully deployed on physical quadruped robots, demonstrating its effectiveness in real-world multi-robot systems. Demos and code are available at: https://github.com/zhaozijie2022/seqwm-marl
Problem

Research questions and friction points this paper is trying to address.

Extending model-based reinforcement learning to multi-robot cooperation
Reducing communication overhead in multi-agent systems
Enhancing cooperative performance through explicit intention sharing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sequential World Model for multi-agent reinforcement learning
Independent agent-wise models decompose joint dynamics
Sequential communication enables intention sharing with linear complexity
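
The sequential scheme described above can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D grid, the `plan_trajectory` rule, and the function names are all hypothetical, and the learned latent world models are replaced by a hand-written dynamics rule. It only shows the communication pattern, in which each agent plans against the predicted trajectories of its predecessors and forwards one message to its successor, so the number of transmissions grows linearly with the number of agents.

```python
from typing import List, Tuple


def plan_trajectory(start: int, goal: int,
                    predecessor_plans: List[List[int]],
                    horizon: int) -> List[int]:
    """Toy stand-in for one agent's rollout and planning step.

    The agent moves one cell per step toward its goal, but waits in place
    whenever the cell it wants is predicted to be occupied by a predecessor
    at that same time step.
    """
    trajectory = []
    pos = start
    for t in range(horizon):
        step = (goal > pos) - (goal < pos)  # -1, 0, or +1
        candidate = pos + step
        if any(plan[t] == candidate for plan in predecessor_plans):
            candidate = pos  # yield to the predecessor's shared intention
        pos = candidate
        trajectory.append(pos)
    return trajectory


def sequential_rollout(starts: List[int], goals: List[int],
                       horizon: int) -> Tuple[List[List[int]], int]:
    """Run agents in a fixed order, passing accumulated plans down the chain."""
    message: List[List[int]] = []  # predicted trajectories of agents so far
    transmissions = 0
    for i, (start, goal) in enumerate(zip(starts, goals)):
        if i > 0:
            transmissions += 1  # agent i-1 forwards the message to agent i
        message = message + [plan_trajectory(start, goal, message, horizon)]
    return message, transmissions


def all_to_all_transmissions(n_agents: int) -> int:
    """Transmissions if every agent sent its plan to every other agent."""
    return n_agents * (n_agents - 1)


if __name__ == "__main__":
    plans, sent = sequential_rollout(starts=[0, 4], goals=[4, 0], horizon=4)
    print(plans, sent)
```

In this sketch a chain of N agents needs N - 1 transmissions, compared with N(N - 1) for all-to-all broadcasting. The message itself grows along the chain here; how SeqWM encodes or compresses what is passed on is not specified in the text above.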
Zijie Zhao
School of Artificial Intelligence, University of Chinese Academy of Sciences
Honglei Guo
Institute of Automation, Chinese Academy of Sciences
Shengqian Chen
Institute of Automation, Chinese Academy of Sciences
Kaixuan Xu
School of Artificial Intelligence, University of Chinese Academy of Sciences
Bo Jiang
Institute of Automation, Chinese Academy of Sciences
Yuanheng Zhu
Institute of Automation, Chinese Academy of Sciences
Dongbin Zhao
Institute of Automation, Chinese Academy of Sciences
Deep Reinforcement Learning, Adaptive Dynamic Programming, Game AI, Smart Driving, Robotics