🤖 AI Summary
Dexterous manipulation policies generalize poorly to novel environments, tasks, and robot embodiments, and the high cost and limited scale of real-world robot data collection further hinder practical deployment. To address this, the authors build DexWild-System, a low-cost, mobile, and easy-to-use device that lets a distributed team of data collectors record in-the-wild human hand interactions across many environments and objects. On top of this data, the DexWild learning framework co-trains policies on both human and robot demonstrations, substantially reducing reliance on costly robot teleoperation data. Experiments show that the resulting policies achieve a 68.5% task success rate in unseen environments, roughly 3.9× higher than training on robot demonstration data alone, and improve cross-embodiment generalization by 5.8×.
📝 Abstract
Large-scale, diverse robot datasets have emerged as a promising path toward enabling dexterous manipulation policies to generalize to novel environments, but acquiring such datasets presents many challenges. While teleoperation provides high-fidelity datasets, its high cost limits its scalability. Instead, what if people could use their own hands, just as they do in everyday life, to collect data? In DexWild, a diverse team of data collectors uses their hands to collect hours of interactions across a multitude of environments and objects. To record this data, we create DexWild-System, a low-cost, mobile, and easy-to-use device. The DexWild learning framework co-trains on both human and robot demonstrations, leading to improved performance compared to training on each dataset individually. This combination results in robust robot policies capable of generalizing to novel environments, tasks, and embodiments with minimal additional robot-specific data. Experimental results demonstrate that DexWild significantly improves performance, achieving a 68.5% success rate in unseen environments, nearly four times higher than policies trained with robot data only, and offering 5.8× better cross-embodiment generalization. Video results, codebases, and instructions are available at https://dexwild.github.io
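Since the core idea is co-training one policy on pooled human and robot demonstrations, a minimal sketch of the batch-mixing pattern may help make this concrete. Everything here is an assumption for illustration, not the paper's implementation: the dataset sizes, observation/action dimensions, and the 50/50 sampling ratio are placeholders, and the real pipeline would feed retargeted human hand trajectories rather than random tensors.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset, WeightedRandomSampler

# Stand-in datasets of (observation, action) pairs. Hypothetically, the large
# set holds retargeted in-the-wild human hand data and the small set holds
# robot teleoperation episodes; shapes and sizes are illustrative only.
human_demos = TensorDataset(torch.randn(10_000, 64), torch.randn(10_000, 16))
robot_demos = TensorDataset(torch.randn(500, 64), torch.randn(500, 16))
combined = ConcatDataset([human_demos, robot_demos])

# Upweight the small robot dataset so each batch mixes both embodiments at a
# target ratio (here an assumed 50/50), instead of being dominated by the far
# larger human dataset.
robot_fraction = 0.5
weights = torch.cat([
    torch.full((len(human_demos),), (1 - robot_fraction) / len(human_demos)),
    torch.full((len(robot_demos),), robot_fraction / len(robot_demos)),
])
sampler = WeightedRandomSampler(weights, num_samples=len(combined), replacement=True)
loader = DataLoader(combined, batch_size=256, sampler=sampler)

for obs, action in loader:
    pass  # one co-training update on a mixed human/robot batch goes here
```

The mixing ratio is the key knob in this pattern: it trades off the diversity of cheap human data against the embodiment fidelity of scarce robot data, and would need tuning per task.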