🤖 AI Summary
Task-oriented grasping (TOG) with high-DOF dexterous hands is challenging: the system must satisfy task constraints in a high-dimensional, multimodal solution space while interpreting natural language instructions. Method: We propose DexTOG, a language-guided, diffusion-based framework made up of DexDiffu, a language-conditioned diffusion model for dexterous grasp pose generation, and a data engine that supports it. The framework is used to build DexTOG-80K, a large-scale TOG dataset collected with a Shadow robot hand performing various tasks on 80 objects from five categories. Contribution/Results: DexDiffu is reported to achieve high success rates in both simulated and physical experiments and to improve cross-task generalization, and the work establishes a new benchmark for dexterous TOG.
📝 Abstract
This study introduces DexTOG, a novel language-guided, diffusion-based learning framework for task-oriented grasping (TOG) with dexterous hands. Unlike existing methods, which mainly focus on two-finger grippers, this research addresses the complexities of dexterous manipulation, where the system must identify non-unique optimal grasp poses under specific task constraints, accommodate multiple valid grasps, and search a high degree-of-freedom configuration space during grasp planning. DexTOG comprises a diffusion-based grasp pose generation model, DexDiffu, and a data engine that supports it. Using DexTOG, we also propose a new dataset, DexTOG-80K, built with a Shadow robot hand performing various tasks on 80 objects from five categories, showcasing the dexterity and multi-tasking capabilities of the robotic hand. This research presents a significant advance in dexterous TOG and provides a comprehensive dataset with simulation validation, setting a new benchmark in robotic manipulation research.