Open X-Embodiment: Robotic Learning Datasets and RT-X Models

📅 2023-10-13
🏛️ arXiv.org
📈 Citations: 451
Influential: 48
🤖 AI Summary
Robot learning is fragmented: models are conventionally trained separately for each robot platform, task, and environment. Method: This paper studies cross-embodiment ("X-robot") learning as a path to universal policies. The authors assemble the largest standardized robotic manipulation dataset to date, built collaboratively across many institutions, and propose RT-X, a Transformer-based architecture that unifies action representations and data formats across robots, trained via large-scale multi-robot behavioral cloning and transfer learning. Contribution/Results: RT-X exhibits positive transfer across 22 heterogeneous robot platforms, with substantially higher task success rates than policies trained on single-robot data. This work offers among the first systematic empirical evidence for the effectiveness and scalability of cross-robot positive transfer, a foundational step toward generalizable robotic policies.
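A key ingredient the summary mentions is unifying action representations across robots. RT-X coarsely aligns each robot's native action space to a shared 7-dimensional end-effector format (position deltas, rotation deltas, gripper command). The sketch below illustrates the idea only; the field names and defaults are hypothetical, not the dataset's actual schema.

```python
import numpy as np

def unify_action(robot_action: dict) -> np.ndarray:
    """Map a robot-specific action dict into a shared 7-D format:
    3 end-effector position deltas, 3 rotation deltas, 1 gripper command.
    Keys here ("delta_xyz", "delta_rpy", "gripper") are illustrative."""
    return np.concatenate([
        np.asarray(robot_action.get("delta_xyz", np.zeros(3)), dtype=np.float32),
        np.asarray(robot_action.get("delta_rpy", np.zeros(3)), dtype=np.float32),
        np.asarray([robot_action.get("gripper", 0.0)], dtype=np.float32),
    ])

# Two robots with different native action spaces land in the same format,
# so their demonstrations can be pooled into one training set.
a1 = unify_action({"delta_xyz": [0.01, 0.0, -0.02],
                   "delta_rpy": [0.0, 0.0, 0.1],
                   "gripper": 1.0})
a2 = unify_action({"delta_xyz": [0.0, 0.03, 0.0], "gripper": 0.0})
```

Both `a1` and `a2` are 7-vectors, regardless of which fields the source robot provided.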
📝 Abstract
Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.
Problem

Research questions and friction points this paper is trying to address.

Can generalist X-robot policies replace application-specific robotic models?
How can models be trained to adapt efficiently to diverse robots, tasks, and environments?
Does large-scale multi-robot data enable effective positive transfer?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Standardized datasets for diverse robotic learning
Generalist X-robot policy for multiple applications
High-capacity RT-X model enabling positive transfer
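Once actions share a common format, the training recipe named above (multi-robot behavioral cloning) reduces to supervised regression on pooled demonstrations. The toy below sketches that idea with a linear stand-in for the policy and a synthetic expert; it is a minimal illustration, not the RT-X training pipeline, and all dimensions and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical expert: demonstrations from different robots are assumed
# already mapped into one unified 7-D action space, so they pool cleanly.
W_true = (rng.normal(size=(16, 7)) * 0.1).astype(np.float32)

def sample_batch(batch: int = 32):
    """Draw a mixed batch of (observation, expert action) pairs."""
    obs = rng.normal(size=(batch, 16)).astype(np.float32)
    act = obs @ W_true  # expert actions in the unified format
    return obs, act

W = np.zeros((16, 7), dtype=np.float32)  # linear stand-in for the policy
lr = 0.05
for _ in range(500):
    obs, act = sample_batch()
    pred = obs @ W
    # Gradient of the mean-squared behavioral-cloning loss.
    W -= lr * (obs.T @ (pred - act)) / len(obs)

obs, act = sample_batch()
mse = float(np.mean((obs @ W - act) ** 2))
```

After training, `mse` is close to zero: the cloned policy reproduces the pooled expert, which is the mechanism positive transfer builds on when the pooled data spans many robots.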
👥 Authors
Abhishek Padalkar, Acorn Pooley, Ajinkya Jain, Alex Bewley, Alex Herzog, A. Irpan, Alexander Khazatsky, Anant Rai, Anikait Singh, Anthony Brohan, Antonin Raffin, Ayzaan Wahid, Ben Burgess-Limerick, Beomjoon Kim, Bernhard Schölkopf, Brian Ichter, Cewu Lu, Charles Xu, Chelsea Finn, Chenfeng Xu, Cheng Chi, Chenguang Huang, Christine Chan, Chuer Pan, Chuyuan Fu, Coline Devin, Danny Driess, Deepak Pathak, Dhruv Shah, Dieter Büchler, Dmitry Kalashnikov, Dorsa Sadigh, Edward Johns, Federico Ceola, Fei Xia, Gaoyue Zhou, Gaurav S. Sukhatme, Gautam Salhotra, Ge Yan, Giulio Schiavi, Hao Su, Haoshu Fang, Haochen Shi, Henrik I Christensen, Hiroki Furuta, Igor Mordatch, Ilija Radosavovic, Isabel Leal, Jacky Liang, Jaehyung Kim, Jan Schneider, Jasmine Hsu, Jeannette Bohg, Jiajun Wu, Jialin Wu, Jianlan Luo, Jiayuan Gu, Jie Tan, Jihoon Oh, Joseph J. Lim, João Silvério, Junhyek Han, Karl Pertsch, Karol Hausman, Keegan Go, Ken Goldberg, Kendra Byrne, Kenneth Oslund, Kento Kawaharazuka, Kevin Zhang, Krishan Rana, Sergey Levine