RoboOS-NeXT: A Unified Memory-based Framework for Lifelong, Scalable, and Robust Multi-Robot Collaboration

📅 2025-10-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address weak lifelong adaptability, poor scalability, and insufficient scheduling robustness in multi-robot systems—limitations arising from agent-centric, short-term memory architectures that hinder long-term learning, heterogeneous team scaling, and fault recovery—this paper proposes a memory-augmented collaborative framework. Its core innovation is Spatio-Temporal-Embodiment Memory (STEM), a unified representation integrating spatial structure, temporal events, and embodiment features that gives heterogeneous robots a shared global memory and supports brain-cerebellum hierarchical coordination. Coupled with a vision-language-action model and a hierarchical control architecture, the framework establishes a closed "cognition–memory–execution" loop. Evaluated in restaurant, supermarket, and home environments, it significantly improves task completion rate and collaboration efficiency, supports scalable deployment across >1,000 robots, sustains continuous operation for over 72 hours, and enables autonomous recovery from dynamic failures.

📝 Abstract
The proliferation of collaborative robots across diverse tasks and embodiments presents a central challenge: achieving lifelong adaptability, scalable coordination, and robust scheduling in multi-agent systems. Existing approaches, from vision-language-action (VLA) models to hierarchical frameworks, fall short due to their reliance on limited or individual-agent memory. This fundamentally constrains their ability to learn over long horizons, scale to heterogeneous teams, or recover from failures, highlighting the need for a unified memory representation. To address these limitations, we introduce RoboOS-NeXT, a unified memory-based framework for lifelong, scalable, and robust multi-robot collaboration. At the core of RoboOS-NeXT is the novel Spatio-Temporal-Embodiment Memory (STEM), which integrates spatial scene geometry, temporal event history, and embodiment profiles into a shared representation. This memory-centric design is integrated into a brain-cerebellum framework, where a high-level brain model performs global planning by retrieving and updating STEM, while low-level controllers execute actions locally. This closed loop between cognition, memory, and execution enables dynamic task allocation, fault-tolerant collaboration, and consistent state synchronization. We conduct extensive experiments spanning complex coordination tasks in restaurants, supermarkets, and households. Our results demonstrate that RoboOS-NeXT achieves superior performance across heterogeneous embodiments, validating its effectiveness in enabling lifelong, scalable, and robust multi-robot collaboration. Project website: https://flagopen.github.io/RoboOS/
Problem

Research questions and friction points this paper is trying to address.

Achieving lifelong adaptability in multi-robot collaboration systems
Enabling scalable coordination across heterogeneous robot teams
Providing robust scheduling and fault-tolerant collaboration capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified memory framework integrates spatial-temporal-embodiment data
Brain-cerebellum architecture separates planning from local execution
Shared memory enables dynamic task allocation and fault tolerance
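The interplay of the three contributions above can be illustrated with a toy sketch. This is not the paper's implementation — all names (`STEM`, `allocate`, `execute_with_recovery`) and data shapes are hypothetical — but it shows the claimed pattern: a shared spatio-temporal-embodiment memory that a brain-level planner reads for task allocation and updates on failure, so a dead robot's task is reassigned from the same shared state.

```python
import time
from dataclasses import dataclass, field

@dataclass
class STEM:
    """Toy shared memory: spatial map, temporal event log, embodiment profiles.
    All field layouts here are illustrative assumptions, not the paper's schema."""
    spatial: dict = field(default_factory=dict)     # object -> location
    temporal: list = field(default_factory=list)    # (timestamp, event) history
    embodiment: dict = field(default_factory=dict)  # robot -> capability profile

    def log(self, event: str) -> None:
        self.temporal.append((time.time(), event))

def allocate(stem: STEM, task: dict):
    """Brain-level planning step: pick a live robot whose embodiment
    profile covers the task's required skill."""
    for robot, profile in stem.embodiment.items():
        if profile["alive"] and task["skill"] in profile["skills"]:
            stem.log(f"assigned {task['name']} to {robot}")
            return robot
    return None  # no capable robot remains

def execute_with_recovery(stem: STEM, task: dict):
    """Closed cognition-memory-execution loop: on failure, mark the robot
    dead in shared memory and re-plan, so the task migrates automatically."""
    while (robot := allocate(stem, task)) is not None:
        if not stem.embodiment[robot]["fails"]:   # stand-in for real execution
            stem.log(f"{robot} completed {task['name']}")
            return robot
        stem.embodiment[robot]["alive"] = False   # fault observed -> update STEM
        stem.log(f"{robot} failed; reallocating")
    return None
```

Because both the planner and the recovery logic operate on the one `STEM` instance rather than per-robot state, adding robots or surviving a failure requires no change to the loop itself — which is the scalability and fault-tolerance argument the bullets above make.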
👥 Authors
Huajie Tan — Peking University (Embodied AI, Foundation Models)
Cheng Chi — Columbia University, Stanford University (robotics)
Xiansheng Chen — Beijing Academy of Artificial Intelligence
Yuheng Ji — Institute of Automation, Chinese Academy of Sciences (Embodied AI, Computer Vision)
Zhongxia Zhao — Beijing Academy of Artificial Intelligence
Xiaoshuai Hao — Beijing Academy of Artificial Intelligence (BAAI) (vision and language)
Yaoxu Lyu — State Key Laboratory of Multimedia Information Processing, School of Computer Science, Peking University
Mingyu Cao — Beijing Academy of Artificial Intelligence
Junkai Zhao — Beijing Academy of Artificial Intelligence
Huaihai Lyu — Institute of Automation (multi-modal, embodied intelligence)
Enshen Zhou — Beihang University (Embodied AI, Embodied Agent, Robot Learning, Generative Model)
Ning Chen — State Key Laboratory of Multimedia Information Processing, School of Computer Science, Peking University
Yankai Fu — State Key Laboratory of Multimedia Information Processing, School of Computer Science, Peking University
Cheng Peng — Beijing Academy of Artificial Intelligence
Wei Guo — Beijing Academy of Artificial Intelligence
Dong Liang — Beijing Academy of Artificial Intelligence
Zhuo Chen — Beijing Academy of Artificial Intelligence
Mengsi Lyu — Beijing Academy of Artificial Intelligence
Chenrui He — Beijing Academy of Artificial Intelligence
Yulong Ao — Beijing Academy of Artificial Intelligence
Yonghua Lin — Beijing Academy of Artificial Intelligence
Pengwei Wang — University of Calgary (Computer Science, Security)
Zhongyuan Wang — Beijing Academy of Artificial Intelligence
Shanghang Zhang — Peking University (Embodied AI, Foundation Models)