🤖 AI Summary
This work addresses two obstacles to robot learning: robot data collection is costly and hard to scale, and existing human demonstration datasets are limited in scope, difficult to extend, and fragmented across institutions. To overcome these challenges, we introduce EgoVerse, a collaborative platform that unifies data collection, processing, and access under a shared framework, with standardized formats and manipulation-relevant annotations. The current release is a large-scale egocentric manipulation dataset comprising 1,362 hours (80k episodes) of human demonstrations spanning 1,965 tasks, 240 scenes, and 2,087 unique demonstrators. Using this resource, we study human-to-robot transfer with experiments replicated across multiple labs, tasks, and robot embodiments under shared protocols. Our experiments show that policy performance generally improves with more human data, but that effective scaling depends on how well the human data align with the robot learning objectives, supporting reproducible, data-driven research in robot learning.
📝 Abstract
Robot learning increasingly depends on large and diverse data, yet robot data collection remains expensive and difficult to scale. Egocentric human data offer a promising alternative by capturing rich manipulation behavior across everyday environments. However, existing human datasets are often limited in scope, difficult to extend, and fragmented across institutions. We introduce EgoVerse, a collaborative platform for human data-driven robot learning that unifies data collection, processing, and access under a shared framework, enabling contributions from individual researchers, academic labs, and industry partners. The current release includes 1,362 hours (80k episodes) of human demonstrations spanning 1,965 tasks, 240 scenes, and 2,087 unique demonstrators, with standardized formats, manipulation-relevant annotations, and tooling for downstream learning. Beyond the dataset, we conduct a large-scale study of human-to-robot transfer with experiments replicated across multiple labs, tasks, and robot embodiments under shared protocols. We find that policy performance generally improves with increased human data, but that effective scaling depends on alignment between human data and robot learning objectives. Together, the dataset, platform, and study establish a foundation for reproducible progress in human data-driven robot learning. Videos and additional information can be found at https://egoverse.ai/