Is Diversity All You Need for Scalable Robotic Manipulation?

📅 2025-07-08
🤖 AI Summary
The role of data diversity in robotic manipulation remains poorly understood, and the intuition that “more diversity is always better” lacks rigorous validation. This work examines three dimensions of diversity (task, embodiment, and expert) through experiments on multiple robot platforms within a pretraining-finetuning framework. It reports three findings: task diversity matters more than per-task demonstration quantity and improves transfer to novel downstream tasks; multi-embodiment pretraining data is optional, since models pretrained on high-quality single-embodiment data transfer efficiently to other platforms; and expert diversity can confound policy learning, with velocity multimodality as a key contributing factor. To address the last point, the authors propose a distribution debiasing method that mitigates velocity ambiguity; the resulting GO-1-Pro model gains 15% in performance, equivalent to using 2.5× the pretraining data. The practical guidance for data curation is to prioritize task diversity, since high-quality data from a single embodiment can be sufficient for pretraining.

📝 Abstract
Data scaling has driven remarkable success in foundation models for Natural Language Processing (NLP) and Computer Vision (CV), yet the principles of effective data scaling in robotic manipulation remain insufficiently understood. In this work, we investigate the nuanced role of data diversity in robot learning by examining three critical dimensions: task (what to do), embodiment (which robot to use), and expert (who demonstrates), challenging the conventional intuition of "more diverse is better". Through extensive experiments on various robot platforms, we reveal that (1) task diversity proves more critical than per-task demonstration quantity, benefiting transfer from diverse pre-training tasks to novel downstream scenarios; (2) multi-embodiment pre-training data is optional for cross-embodiment transfer: models trained on high-quality single-embodiment data can efficiently transfer to different platforms, showing more desirable scaling properties during fine-tuning than multi-embodiment pre-trained models; and (3) expert diversity, arising from individual operational preferences and stochastic variations in human demonstrations, can be confounding to policy learning, with velocity multimodality emerging as a key contributing factor. Based on this insight, we propose a distribution debiasing method to mitigate velocity ambiguity; the resulting GO-1-Pro achieves a substantial performance gain of 15%, equivalent to using 2.5 times as much pre-training data. Collectively, these findings provide new perspectives and offer practical guidance on how to scale robotic manipulation datasets effectively.
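To make the velocity multimodality claim concrete, here is a minimal sketch of why demonstrations recorded at different speeds can confuse a policy trained with a mean-squared-error objective. Everything in it is an assumption for illustration: the 1-D reach task, the two hypothetical experts, and the helper names `make_demo`, `actions`, and `resample`. The re-timing step is a simple stand-in and is not the distribution debiasing method proposed in the paper, which the abstract does not describe in detail.

```python
from statistics import mean

def make_demo(n_steps):
    """Positions of a hypothetical expert moving from 0 to 1 in n_steps equal steps."""
    return [i / n_steps for i in range(n_steps + 1)]

def actions(positions):
    """Per-step displacement, used here as a stand-in for a velocity command."""
    return [b - a for a, b in zip(positions, positions[1:])]

def resample(positions, n_steps):
    """Re-time a demonstration to n_steps steps by linear interpolation.

    Illustrative stand-in only, NOT the paper's debiasing method.
    """
    last = len(positions) - 1
    out = []
    for i in range(n_steps + 1):
        x = i / n_steps * last          # fractional index into the original
        lo = min(int(x), last - 1)      # left neighbour, clamped at the end
        frac = x - lo
        out.append(positions[lo] + frac * (positions[lo + 1] - positions[lo]))
    return out

fast = make_demo(10)   # assumed fast expert: 0.10 per step
slow = make_demo(20)   # assumed slow expert: 0.05 per step

# Pooling both experts gives a bimodal action set {0.10, 0.05}.
pooled = actions(fast) + actions(slow)

# A constant predictor trained with MSE converges to the pooled mean,
# which matches neither expert's speed.
mse_optimal = mean(pooled)

# After re-timing every demonstration to 20 steps, only one mode remains.
retimed = actions(resample(fast, 20)) + actions(resample(slow, 20))

print(f"pooled modes: {sorted(set(round(a, 3) for a in pooled))}")
print(f"MSE-optimal action on pooled data: {mse_optimal:.4f}")
print(f"modes after re-timing: {sorted(set(round(a, 3) for a in retimed))}")
```

In this toy setting the averaged action is a speed that no demonstrator ever used, which is one plausible reading of how velocity ambiguity can degrade policy learning.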
Problem

Research questions and friction points this paper is trying to address.

Role of data diversity in robotic manipulation learning
Impact of task, embodiment, expert diversity on robot performance
Debiasing method to mitigate velocity ambiguity in policies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Task diversity enhances transfer to novel scenarios
Single-embodiment data enables efficient cross-platform transfer
Distribution debiasing mitigates velocity ambiguity in policies
Authors

Modi Shi (Beihang University; embodied ai)
Li Chen (The University of Hong Kong, Shanghai AI Lab)
Jin Chen (Shanghai Innovation Institute)
Yuxiang Lu (AgiBot)
Chiming Liu (AgiBot)
Guanghui Ren (AgiBot)
Ping Luo (National University of Defense Technology; distributed_computing)
Di Huang (Beihang University)
Maoqing Yao (Google)
Hongyang Li (The University of Hong Kong)