🤖 AI Summary
This paper addresses the open-set, language-guided dexterous grasping problem—specifically, zero-shot semantic understanding and action generation for unseen object categories. We propose Generalizable-Instructive Affordance (GIA), a functional representation grounded in local geometry and category-agnostic attributes, enabling robust generalization across novel objects and tasks. Methodologically, we introduce a novel dual-stream generative framework: Affordance Flow Matching (AFM) and Grasp Flow Matching (GFM), integrating geometry-aware representation learning, conditional flow matching, and multimodal language–action alignment. Our approach is trained jointly on open-set synthetic and real-world data. Evaluated on a newly constructed open-set benchmark and deployed on a physical dexterous robot platform, our method achieves state-of-the-art performance—demonstrating significantly higher success rates and strong cross-category generalization in language-driven dexterous grasping.
📝 Abstract
Language-guided robot dexterous grasp generation enables robots to grasp and manipulate objects based on human commands. However, previous data-driven methods struggle to understand intention and execute grasps on unseen categories in the open set. In this work, we explore a new task, Open-set Language-guided Dexterous Grasp, and find that the main challenge is the large gap between high-level human language semantics and low-level robot actions. To bridge this gap, we propose the Affordance Dexterous Grasp (AffordDexGrasp) framework, built on a new generalizable-instructive affordance representation. This affordance generalizes to unseen categories by leveraging the object's local structure and category-agnostic semantic attributes, thereby effectively guiding dexterous grasp generation. Built upon this affordance, our framework introduces Affordance Flow Matching (AFM), which generates affordance from language input, and Grasp Flow Matching (GFM), which generates dexterous grasps conditioned on the affordance. To evaluate our framework, we build an open-set table-top language-guided dexterous grasp dataset. Extensive experiments in simulation and the real world show that our framework surpasses all previous methods in open-set generalization.
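For intuition on the generative machinery shared by AFM and GFM: conditional flow matching trains a velocity field to transport noise to data along a simple interpolation path, then samples by integrating that field. Below is a minimal numpy sketch of the generic training-pair construction and an Euler sampler. The `toy_field` velocity function is a hypothetical stand-in for a trained conditional network, not the paper's architecture, and the conditioning vector here is a placeholder for a language or affordance embedding.

```python
import numpy as np

def fm_training_pair(x1, cond, rng):
    """Build one conditional flow-matching training example.
    x1: data sample (e.g. a grasp pose vector); cond: conditioning
    vector (e.g. a language embedding, passed through unchanged)."""
    x0 = rng.standard_normal(x1.shape)   # noise endpoint of the path
    t = rng.uniform()                    # random time in [0, 1]
    xt = (1.0 - t) * x0 + t * x1         # linear interpolation at t
    v_target = x1 - x0                   # constant target velocity
    return xt, t, cond, v_target         # regress v_theta(xt, t, cond) -> v_target

def euler_sample(v_field, cond, dim, steps, rng):
    """Generate a sample by Euler-integrating a velocity field
    from noise at t=0 to data at t=1."""
    x = rng.standard_normal(dim)
    dt = 1.0 / steps
    t = 0.0
    for _ in range(steps):
        x = x + dt * v_field(x, t, cond)
        t += dt
    return x

# Toy conditional field that transports any state toward `cond`
# (the exact solution of this ODE reaches cond at t=1).
def toy_field(x, t, cond):
    return (cond - x) / max(1.0 - t, 1e-3)
```

The key invariant of the linear path is that `xt + (1 - t) * v_target` recovers the data sample `x1`, which is what makes the velocity regression target well defined.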