GenDexHand: Generative Simulation for Dexterous Hands

📅 2025-11-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Data scarcity severely hinders the development of dexterous-hand embodied intelligence, particularly due to the lack of scalable, trainable methods for generating large-scale manipulation tasks. To address this, we propose the first generative simulation framework tailored for high-DoF dexterous hands. Our method leverages vision-language models (VLMs) to drive closed-loop feedback optimization—dynamically adjusting object placement and scale to enhance scene realism—and incorporates subtask decomposition to enable sequential reinforcement learning training. Furthermore, VLMs perform semantic quality assessment and iterative refinement of generated tasks. This approach significantly improves simulation diversity and task plausibility, yielding a 42% increase in training efficiency and a 31% improvement in task success rate. To our knowledge, this is the first scalable, high-fidelity, semantically controllable paradigm for generating dexterous manipulation simulation data for embodied intelligence.

Technology Category

Application Category

📝 Abstract
Data scarcity remains a fundamental bottleneck for embodied intelligence. Existing approaches use large language models (LLMs) to automate gripper-based simulation generation, but they transfer poorly to dexterous manipulation, which demands more specialized environment design. Meanwhile, dexterous manipulation tasks are inherently more difficult due to their higher degrees of freedom. Massively generating feasible and trainable dexterous hand tasks remains an open challenge. To this end, we present GenDexHand, a generative simulation pipeline that autonomously produces diverse robotic tasks and environments for dexterous manipulation. GenDexHand introduces a closed-loop refinement process that adjusts object placements and scales based on vision-language model (VLM) feedback, substantially improving the average quality of generated environments. Each task is further decomposed into sub-tasks to enable sequential reinforcement learning, reducing training time and increasing success rates. Our work provides a viable path toward scalable training of diverse dexterous hand behaviors in embodied intelligence by offering a simulation-based solution to synthetic data generation. Our website: https://winniechen2002.github.io/GenDexHand/.
Problem

Research questions and friction points this paper is trying to address.

Addressing data scarcity in dexterous hand manipulation tasks
Generating feasible and trainable environments for high-DOF hands
Automating simulation creation to enable scalable reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative simulation pipeline for dexterous manipulation tasks
Closed-loop refinement using vision-language model feedback
Task decomposition enabling sequential reinforcement learning
🔎 Similar Papers
No similar papers found.
F
Feng Chen
The University of Hong Kong, Transcengram
Z
Zhuxiu Xu
Transcengram, Shanghai Jiao Tong University
Tianzhe Chu
Tianzhe Chu
The University of Hong Kong
Machine Learning
X
Xunzhe Zhou
The University of Hong Kong, Transcengram
L
Li Sun
The University of Hong Kong
Z
Zewen Wu
The University of Hong Kong
Shenghua Gao
Shenghua Gao
The University of Hong Kong
Computer visionPattern RecognitionMachine Learning
Z
Zhongyu Li
The Chinese University of Hong Kong
Yanchao Yang
Yanchao Yang
Assistant Professor, HKU; Stanford University; UCLA
Embodied AIComputer VisionMachine Learning
Y
Yi Ma
The University of Hong Kong, Transcengram