DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

📅 2024-03-19
🏛️ Robotics: Science and Systems
📈 Citations: 151 (influential: 11)
🤖 AI Summary
Existing robot manipulation policies generalize poorly because they are trained on small-scale, low-diversity simulation data or on real-world data from a handful of environments. To address this, we introduce DROID, a large-scale, real-world robot manipulation dataset collected in the wild. It comprises 564 diverse scenes, 84 task categories, and 76k high-quality trajectories (350 hours of interaction), collected over 12 months by 50 geographically distributed contributors. DROID's distributed collection protocol coordinates a standardized robot hardware setup across continents, combining uniform control interfaces, precise action alignment, and rigorous quality filtering. We fully open-source the hardware specifications, data collection infrastructure, and training code. Policies trained on DROID achieve a 27% average improvement in success rate on cross-scene generalization benchmarks and stronger zero-shot transfer than prior state-of-the-art methods.
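To put the headline numbers in context, here is a quick back-of-the-envelope calculation using only the figures quoted above (76k trajectories, 350 hours, 564 scenes, 50 collectors); the derived averages are illustrative, not values reported by the paper:

```python
# Headline figures from the summary above.
TRAJECTORIES = 76_000  # demonstration trajectories
HOURS = 350            # total interaction time
SCENES = 564           # distinct scenes
COLLECTORS = 50        # data collectors

# Derived averages (rounded).
avg_traj_seconds = HOURS * 3600 / TRAJECTORIES  # ≈ 16.6 s per demonstration
traj_per_scene = TRAJECTORIES / SCENES          # ≈ 135 trajectories per scene
traj_per_collector = TRAJECTORIES / COLLECTORS  # 1520 trajectories per collector

print(f"{avg_traj_seconds:.1f} s/trajectory, "
      f"{traj_per_scene:.0f} trajectories/scene, "
      f"{traj_per_collector:.0f} trajectories/collector")
# → 16.6 s/trajectory, 135 trajectories/scene, 1520 trajectories/collector
```

The short average episode length (under 20 seconds) is typical of teleoperated single-task demonstrations, which is consistent with the dataset's focus on many short, diverse interactions rather than long-horizon episodes.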

📝 Abstract
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.
Problem

Research questions and friction points this paper is trying to address.

Lack of large-scale, diverse robot manipulation datasets
Logistical and safety challenges of collecting data across varied real-world environments
Limited generalization of current robot manipulation policies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale, diverse robot manipulation dataset (76k trajectories, 350 hours)
Collected across 564 scenes and 84 tasks by 50 distributed data collectors
Fully open-sourced dataset, policy learning code, and hardware setup guide
Authors

Alexander Khazatsky
Karl Pertsch · UC Berkeley, Stanford University · Artificial Intelligence, Machine Learning, Robotics
Suraj Nair
Ashwin Balakrishna · Physical Intelligence · Robotics, Machine Learning, Reinforcement Learning, Imitation Learning
Sudeep Dasari · Google DeepMind · Robotic Learning, Unsupervised Learning, Computer Vision
Siddharth Karamcheti · PhD Student, Stanford University · Robotics, Human-Robot Interaction, Natural Language Processing, Machine Learning
Soroush Nasiriany · The University of Texas at Austin · Artificial Intelligence, Machine Learning, Robotics
Mohan Kumar Srirama · Carnegie Mellon University · Robot Learning, Unsupervised Learning, Computer Vision
Lawrence Yunliang Chen · PhD Student, UC Berkeley · Robotics, Machine Learning
Kirsty Ellis
Peter David Fagan
Joey Hejna · Stanford University · Reinforcement Learning, Machine Learning
Masha Itkina · Toyota Research Institute (TRI) · Artificial Intelligence, Perception, Autonomous Vehicles
Marion Lepert · Stanford University
Jason Ma
Patrick Tree Miller
Jimmy Wu
Suneel Belkhale · Stanford University · Robotics, AI
Shivin Dass · PhD Student, UT Austin · Artificial Intelligence, Machine Learning, Robot Learning
Huy Ha · Columbia University, Stanford University · Robotics, Reinforcement Learning, 3D Vision
Abraham Lee
Youngwoon Lee · Assistant Professor at Yonsei University · Reinforcement Learning, Robot Learning
Arhan Jain · University of Washington
Marius Memmel · University of Washington · Robotics, Reinforcement Learning, Computer Vision
Sungjae Park · Carnegie Mellon University · Robotics, Robot Learning
Ilija Radosavovic · UC Berkeley · Computer Vision, Machine Learning
Kaiyuan Wang · Staff Software Engineer, Google · Machine Learning, Software Engineering
Albert Zhan
Kevin Black
Cheng Chi · Columbia University, Stanford University · Robotics
Kyle Hatch
Shan Lin
Jingpei Lu · Intuitive Surgical · Computer Vision, Surgical Robotics
Abdul Rehman
Pannag R Sanketi · Google DeepMind, Robotics · Robotics, Machine Learning, Reinforcement Learning, Optimal Control, Computer Vision
Archit Sharma
Cody Simpson
Quan Vuong · Physical Intelligence · Reinforcement Learning, Computer Vision
Homer Walke · University of California, Berkeley · Robotics, Machine Learning, Reinforcement Learning
Blake Wulfe
Ted Xiao · Staff Research Scientist, Google DeepMind · Deep Learning, Artificial Intelligence, Robotics, Reinforcement Learning, Control Theory
Jonathan Yang
Arefeh Yavary
Tony Z. Zhao · Stanford University · Robotics, Machine Learning, NLP
Christopher Agia · PhD Student in Computer Science, Stanford University · Robotics, Machine Learning, Computer Vision, Artificial Intelligence
Rohan Baijal · PhD Student, University of Washington · Robotics
Mateo Guaman Castro · University of Washington · Robot Learning
Daphne Chen · University of Washington · Robotics, Machine Learning
Qiuyu Chen · University of Washington · Vision and Robotics
Trinity Chung
Jaimyn Drake
Ethan Paul Foster
Jensen Gao · Stanford University · Machine Learning, Robotics, Reinforcement Learning
David Antonio Herrera
Minho Heo · KAIST · Robot Learning
Kyle Hsu · Stanford University · Artificial Intelligence, Machine Learning, Robotics
Jiaheng Hu · UT Austin · Robot Learning, Reinforcement Learning, Robotics, Mobile Manipulation
Donovon Jackson
Charlotte Le
Yunshuang Li
Kevin Lin
Roy Lin
Zehan Ma
Abhiram Maddukuri
Suvir Mirchandani · Stanford University
Daniel Morton · PhD Student, Stanford University · Robotics, Optimization, Manipulation, Control
Tony Nguyen
Abby O’Neill
Rosario Scalise · University of Washington · Artificial Intelligence, Robotics, Machine Learning, Optimal Control, NLP
Derick Seale
Victor Son
Stephen Tian · Stanford University · Robotics, Computer Vision, Machine Learning, Reinforcement Learning
Andrew Wang · University of Toronto, Vector Institute · AI Safety
Yilin Wu · Robotics PhD at CMU · Reinforcement Learning, Robotics
Annie Xie
Jingyun Yang · PhD Student, Stanford University
Patrick Yin · University of Washington · Artificial Intelligence, Machine Learning, Robotics
Yunchu Zhang
Osbert Bastani · University of Pennsylvania · Machine Learning, Artificial Intelligence, Programming Languages, Security
Glen Berseth · Assistant Professor, Université de Montréal · Reinforcement Learning, Robotics, Deep Learning, Machine Learning
Jeannette Bohg · Assistant Professor, Stanford University · Robotics, Multi-Modal Perception, Machine Learning, Computer Vision, Grasping and Manipulation
Ken Goldberg · Professor, UC Berkeley and UCSF · Robots, Robotics, Automation, Collaborative Filtering
Abhinav Gupta
Abhishek Gupta
Dinesh Jayaraman · Assistant Professor, University of Pennsylvania · Robot Learning, Computer Vision, Robotics, Machine Learning
Joseph J. Lim · Associate Professor at KAIST · Machine Learning, AI, Reinforcement Learning, Robotics
Jitendra Malik
Roberto Martín-Martín · The University of Texas at Austin · Robotics, Artificial Perception, Machine Learning, Interactive Perception, Probabilistic Reasoning
Subramanian Ramamoorthy · Professor of Robot Learning and Autonomy, School of Informatics, University of Edinburgh · Robotics, Human-Robot Interaction, Autonomous Systems, Artificial Intelligence, Cognitive Science
Dorsa Sadigh · Stanford University · Robotics, Human-Robot Interaction, Machine Learning, Artificial Intelligence, Control Theory
Shuran Song · Stanford University · Robotics, Computer Vision, Machine Learning