Arcadia: Toward a Full-Lifecycle Framework for Embodied Lifelong Learning

📅 2025-11-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing embodied lifelong learning systems typically optimize isolated components, such as data collection or deployment, in isolation, which hinders sustained improvement and cross-environment generalization. This paper introduces Arcadia, the first holistic framework to model embodied learning as an indivisible closed-loop lifecycle comprising four tightly coupled stages: autonomous exploration, generative scene reconstruction, shared multimodal representation learning, and simulation-driven evolution. Its key innovations are the first sim-from-real feedback loop between physical and virtual domains, self-evolving exploration, generative data augmentation, and a unified multimodal representation architecture. Arcadia enables reproducible cross-task and cross-environment evaluation. Empirically, it achieves continuous performance gains on navigation and manipulation benchmarks and successfully transfers learned policies to real-world robots, demonstrating robustness and generalizability.

📝 Abstract
We contend that embodied learning is fundamentally a lifecycle problem rather than a single-stage optimization. Systems that optimize only one link (data collection, simulation, learning, or deployment) rarely sustain improvement or generalize beyond narrow settings. We introduce Arcadia, a closed-loop framework that operationalizes embodied lifelong learning by tightly coupling four stages: (1) self-evolving exploration and grounding for autonomous data acquisition in physical environments; (2) generative scene reconstruction and augmentation for realistic, extensible scene creation; (3) a shared embodied representation architecture that unifies navigation and manipulation within a single multimodal backbone; and (4) sim-from-real evaluation and evolution that closes the feedback loop through simulation-based adaptation. This coupling is non-decomposable: removing any stage breaks the improvement loop and reverts the system to one-shot training. Arcadia delivers consistent gains on navigation and manipulation benchmarks and transfers robustly to physical robots, indicating that a tightly coupled lifecycle of continuous real-world data acquisition, generative simulation updates, and shared-representation learning supports lifelong improvement and end-to-end generalization. We release standardized interfaces enabling reproducible evaluation and cross-model comparison in reusable environments, positioning Arcadia as a scalable foundation for general-purpose embodied agents.
Problem

Research questions and friction points this paper addresses.

Sustaining improvement when embodied systems optimize only isolated stages (data collection, simulation, learning, or deployment)
Integrating autonomous data acquisition, generative simulation, and shared representation learning into one closed loop
Enabling continuous improvement and cross-environment generalization on physical robots
Innovation

Methods, ideas, or system contributions that make the work stand out.

Closed-loop framework coupling four lifecycle stages
Generative scene reconstruction for realistic simulation
Shared multimodal representation for navigation and manipulation
Minghe Gao · Zhejiang University · Machine Learning
Juncheng Li · East China Normal University · Super Resolution, Image Restoration, Computer Vision, Medical Image Analysis
Yuze Lin · Zhejiang University
Xuqi Liu · Zhejiang University
Jiaming Ji · Peking University
Xiaoran Pan · Zhejiang University
Zihan Xu · Arizona State University · Machine Learning, Neuromorphic Computing, Memory
Xian Li · Zhejiang University
Mingjie Li · Unitree Tech
Wei Ji · Nanjing University
Rong Wei · Manycore Tech
Rui Tang · Manycore Tech
Qizhou Wang · PhD @ HKBU · Machine Learning
Kai Shen · Associate Professor of Computer Science, University of Rochester · Computer Systems
Jun Xiao · Zhejiang University
Qi Wu · University of Adelaide
Siliang Tang · Professor of Computer Science, Zhejiang University · Natural Language Processing, Cross-media Analysis, Graph Neural Networks
Yueting Zhuang · Zhejiang University