Robot Control Stack: A Lean Ecosystem for Robot Learning at Scale

2025-09-18
AI Summary
Traditional robotics software frameworks become a bottleneck for large-scale training of generalist policies, and robot simulators offer only limited support for moving between simulation and real-world experiments. To address this, the paper introduces Robot Control Stack (RCS), a lean, modular robot learning ecosystem built on a layered architecture with a unified interface for simulated and physical robots. Despite its minimal footprint and few dependencies, RCS supports both real-world experiments and large-scale training in simulation, covering the development cycle of vision-language-action (VLA) and reinforcement learning (RL) policies. The experiments evaluate the VLA models Octo, OpenVLA, and Pi Zero on multiple robots and examine how simulation data can improve real-world policy performance.

Abstract
Vision-Language-Action models (VLAs) mark a major shift in robot learning. They replace specialized architectures and task-tailored components of expert policies with large-scale data collection and setup-specific fine-tuning. In this machine learning-focused workflow that is centered around models and scalable training, traditional robotics software frameworks become a bottleneck, while robot simulations offer only limited support for transitioning from and to real-world experiments. In this work, we close this gap by introducing Robot Control Stack (RCS), a lean ecosystem designed from the ground up to support research in robot learning with large-scale generalist policies. At its core, RCS features a modular and easily extensible layered architecture with a unified interface for simulated and physical robots, facilitating sim-to-real transfer. Despite its minimal footprint and dependencies, it offers a complete feature set, enabling both real-world experiments and large-scale training in simulation. Our contribution is twofold: First, we introduce the architecture of RCS and explain its design principles. Second, we evaluate its usability and performance along the development cycle of VLA and RL policies. Our experiments also provide an extensive evaluation of Octo, OpenVLA, and Pi Zero on multiple robots and shed light on how simulation data can improve real-world policy performance. Our code, datasets, weights, and videos are available at: https://robotcontrolstack.github.io/
Problem

Research questions and friction points this paper is trying to address.

Removing the robotics software bottleneck for scalable robot learning
Enabling seamless sim-to-real transfer for vision-language-action models
Providing a unified interface for simulated and physical robot experiments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular layered architecture for unified robot interface
Lean ecosystem supporting large-scale generalist policy training
Sim-to-real transfer support with a minimal footprint and few dependencies
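
To make the idea of a unified interface for simulated and physical robots concrete, here is a minimal Python sketch. It is not the RCS API: the names `Robot`, `SimRobot`, `LoggingHardwareRobot`, `step_towards`, and `run_episode` are hypothetical and chosen only for illustration. The sketch assumes a joint-position interface and uses a stand-in "hardware" backend that records commands instead of talking to a real driver.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class Robot(Protocol):
    """Hypothetical unified interface; not the actual RCS API."""

    def get_joint_positions(self) -> List[float]: ...

    def set_joint_targets(self, targets: List[float]) -> None: ...


@dataclass
class SimRobot:
    """Toy simulated backend: joints jump straight to the commanded target."""

    joints: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])

    def get_joint_positions(self) -> List[float]:
        return list(self.joints)

    def set_joint_targets(self, targets: List[float]) -> None:
        self.joints = list(targets)


@dataclass
class LoggingHardwareRobot:
    """Stand-in for a physical backend: records commands instead of sending them."""

    joints: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])
    sent_commands: List[List[float]] = field(default_factory=list)

    def get_joint_positions(self) -> List[float]:
        return list(self.joints)

    def set_joint_targets(self, targets: List[float]) -> None:
        # A real driver call would go here; this sketch only logs the command.
        self.sent_commands.append(list(targets))
        self.joints = list(targets)


def step_towards(robot: Robot, goal: List[float], gain: float = 0.5) -> List[float]:
    """One proportional control step written only against the shared interface."""
    current = robot.get_joint_positions()
    targets = [q + gain * (g - q) for q, g in zip(current, goal)]
    robot.set_joint_targets(targets)
    return robot.get_joint_positions()


def run_episode(robot: Robot, goal: List[float], steps: int = 3) -> List[float]:
    """Run the same control loop on whichever backend is passed in."""
    for _ in range(steps):
        step_towards(robot, goal)
    return robot.get_joint_positions()
```

Because `run_episode` depends only on the two interface methods, the same control loop can be exercised on the toy simulator first and then handed the other backend without changes, which is the property a unified interface is meant to provide.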
Tobias Jülg
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany
Pierre Krack
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany
Seongjin Bien
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany
Yannik Blei
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany
Khaled Gamal
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany
Ken Nakahara
Learning, Adaptive Systems and Robotics (LASR) Lab, Faculty of Computer Science, TU Dresden, Germany
Johannes Hechtl
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany; Siemens Foundational Technologies, Siemens AG, Germany
Roberto Calandra
Learning, Adaptive Systems and Robotics (LASR) Lab, Faculty of Computer Science, TU Dresden, Germany
Wolfram Burgard
Professor of Computer Science, University of Technology Nuremberg
Robotics, Artificial Intelligence, AI, Machine Learning, Computer Vision
Florian Walter
University of Technology Nuremberg, Machine Intelligence Lab
Machine Intelligence, Robotics, Machine Learning, AI, Cognitive Robotics