🤖 AI Summary
Grasp generation for multi-fingered dexterous hands generalizes poorly across morphologies because existing methods rely heavily on morphology-specific training data.
Method: We propose a morphology-aware, end-to-end grasp generation framework. A hand morphology encoder constructs low-dimensional morphology embeddings; conditioned on these, together with the object point cloud and wrist pose, the model predicts joint motion coefficients. To enable few-shot transfer and fast inference, we introduce an eigengrasp basis space for dimensionality reduction, a morphology-adaptive architecture, and a Kinematic-Aware Articulation Loss (KAL).
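The core dimensionality-reduction step can be illustrated with a minimal sketch: the amplitude predictor regresses a few coefficients, which a fixed eigengrasp basis decodes back into a full joint articulation, `q = q_mean + E @ a`. The dimensions, the random basis, and the function names below are illustrative assumptions, not the paper's actual values or architecture.

```python
import numpy as np

# Hypothetical dimensions for illustration (not from the paper):
# a 16-DoF hand compressed into a 4-dimensional eigengrasp space.
N_JOINTS, N_EIG = 16, 4

rng = np.random.default_rng(0)
E = rng.standard_normal((N_JOINTS, N_EIG))  # eigengrasp basis (columns = basis grasps)
q_mean = np.zeros(N_JOINTS)                 # mean articulation of the hand

def decode_articulation(coeffs: np.ndarray) -> np.ndarray:
    """Decode low-dimensional coefficients into full joint angles: q = q_mean + E @ a."""
    return q_mean + E @ coeffs

# Coefficients that, in the full system, an amplitude predictor would regress
# from the morphology embedding, object point cloud, and wrist pose.
a = rng.standard_normal(N_EIG)
q = decode_articulation(a)
```

Learning in the 4-dimensional coefficient space instead of the 16-dimensional joint space is what makes few-shot transfer and sub-second inference plausible: the predictor's output dimension stays small regardless of the hand's DoF count.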
Contribution/Results: This work presents the first morphology-embedding-based universal grasp generator for cross-morphology deployment. On unseen objects across three dexterous hands, the simulation success rate reaches 91.9%, with inference under 0.4 s per grasp. After few-shot adaptation to an unseen hand, simulation and real-world success rates reach 85.6% and 87.0%, respectively.
📝 Abstract
Dexterous grasping with multi-fingered hands remains challenging due to high-dimensional articulations and the cost of optimization-based pipelines. Existing end-to-end methods require training on large-scale datasets for specific hands, limiting their ability to generalize across different embodiments. We propose an eigengrasp-based, end-to-end framework for cross-embodiment grasp generation. From a hand's morphology description, we derive a morphology embedding and an eigengrasp set. Conditioned on these, together with the object point cloud and wrist pose, an amplitude predictor regresses articulation coefficients in a low-dimensional space, which are decoded into full joint articulations. Articulation learning is supervised with a Kinematic-Aware Articulation Loss (KAL) that emphasizes fingertip-relevant motions and injects morphology-specific structure. In simulation on unseen objects across three dexterous hands, our model attains a 91.9% average grasp success rate with less than 0.4 seconds of inference per grasp. With few-shot adaptation to an unseen hand, it achieves 85.6% success on unseen objects in simulation, and real-world experiments with the few-shot-adapted hand achieve an 87% success rate. The code and additional materials will be made available upon publication on our project website https://connor-zh.github.io/cross_embodiment_dexterous_grasping.
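The abstract describes KAL only as a supervision term that emphasizes fingertip-relevant motions. One plausible reading is a per-joint-weighted regression loss in which distal joints, whose motion most affects fingertip placement, receive larger weights. The weighting scheme below is an assumption for illustration, not the paper's actual formulation.

```python
import numpy as np

def kinematic_aware_loss(q_pred: np.ndarray, q_gt: np.ndarray,
                         joint_weights: np.ndarray) -> float:
    """Weighted squared-error over joint angles.

    Heavier weights on fingertip-relevant (distal) joints make errors there
    cost more, mimicking the 'fingertip-emphasizing' behavior the paper
    attributes to KAL. The weights themselves would encode morphology-specific
    structure (here they are hand-picked for the example).
    """
    diff = q_pred - q_gt
    return float(np.mean(joint_weights * diff**2))

# Toy 4-joint finger chain: first two joints proximal, last two distal.
q_pred = np.array([0.1, 0.2, 0.3, 0.4])
q_gt   = np.array([0.0, 0.2, 0.2, 0.5])
w      = np.array([0.5, 0.5, 2.0, 2.0])  # assumed: distal joints weighted 4x heavier
loss = kinematic_aware_loss(q_pred, q_gt, w)
```

With equal errors of 0.1 rad on a proximal and a distal joint, the distal error contributes four times as much to the loss under these weights, so gradient descent prioritizes correcting it.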