Learning Granularity-Aware Affordances from Human-Object Interaction for Tool-Based Functional Grasping in Dexterous Robotics

📅 2024-06-30
🏛️ arXiv.org
📈 Citations: 7
Influential: 1
🤖 AI Summary
To address the challenge of functional grasping in dexterous robotic tool manipulation, this paper proposes GAAF-Dex, an end-to-end framework that learns granularity-aware affordance features from human-object interactions to jointly localize functional contact regions (fine-grained) and predict dexterous grasp poses (coarse-grained). We introduce a novel weakly supervised cross-view learning paradigm—leveraging exocentric supervision to guide egocentric affordance estimation—and a force-feedback-driven coarse-to-fine grasp post-processing module. The method integrates multi-granularity affordance representation, functional finger coordinate localization, hand-to-end-effector coordinate transformation, and joint modeling of exocentric and egocentric images. Evaluated on our newly constructed FAH dataset (6K images, 18 tools, 6 task categories), GAAF-Dex significantly outperforms state-of-the-art methods in both functional region localization accuracy and dexterous gesture prediction. The code is publicly available.

📝 Abstract
To enable robots to use tools, the first step is to teach them to employ dexterous gestures to precisely touch the specific areas where tasks are performed. Affordance features of objects serve as a bridge in the functional interaction between agents and objects. However, leveraging these affordance cues to help robots achieve functional tool grasping remains an open problem. To address this, we propose a granularity-aware affordance feature extraction method for locating functional affordance areas and predicting dexterous coarse gestures. We study the intrinsic mechanisms of human tool use. On one hand, we use fine-grained affordance features of object-functional finger contact areas to locate functional affordance regions. On the other hand, we use highly activated coarse-grained affordance features in hand-object interaction regions to predict grasp gestures. Additionally, we introduce a model-based post-processing module that includes functional finger coordinate localization, finger-to-end coordinate transformation, and force feedback-based coarse-to-fine grasping. Together, these form a complete dexterous robotic functional grasping framework, GAAF-Dex, which learns Granularity-Aware Affordances from human-object interaction for tool-based Functional grasping in Dexterous robotics. Unlike fully supervised methods that require extensive data annotation, we employ a weakly supervised approach that extracts relevant cues from exocentric (Exo) images of hand-object interactions to supervise feature extraction in egocentric (Ego) images. We have constructed a small-scale dataset, FAH, which includes nearly 6K exocentric and egocentric images of functional hand-object interactions, covering 18 commonly used tools performing 6 tasks. Extensive experiments on the dataset demonstrate that our method outperforms state-of-the-art methods. The code will be made publicly available at https://github.com/yangfan293/GAAF-DEX.
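The force feedback-based coarse-to-fine grasping in the post-processing module can be pictured as a simple closed-loop refinement: start from the predicted coarse gesture and incrementally close fingers until contact forces reach a target. The sketch below is an illustrative assumption, not the paper's actual controller; the sensor interface (`read_fingertip_forces`), step size, and force threshold are all hypothetical.

```python
import numpy as np

def coarse_to_fine_grasp(coarse_pose, read_fingertip_forces,
                         close_step=0.02, force_target=2.0, max_iters=50):
    """Refine a coarse grasp pose by incrementally closing fingers
    until each fingertip's measured force reaches a target.

    coarse_pose: initial finger joint values predicted from affordance cues.
    read_fingertip_forces: callable returning per-finger force readings
    (hypothetical sensor interface, assumed for this sketch).
    """
    joints = np.asarray(coarse_pose, dtype=float)
    for _ in range(max_iters):
        forces = np.asarray(read_fingertip_forces(joints))
        # Stop once every finger is in firm contact.
        if np.all(forces >= force_target):
            break
        # Close only the fingers not yet in firm contact.
        joints = joints + close_step * (forces < force_target)
    return joints

# Toy usage: a fake sensor whose force grows linearly as joints close.
fake_sensor = lambda j: 5.0 * j
final = coarse_to_fine_grasp(np.zeros(5), fake_sensor)
```

With the toy sensor, each joint closes in fixed steps until its simulated contact force crosses the threshold; a real controller would read tactile sensors on the dexterous hand instead.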
Problem

Research questions and friction points this paper is trying to address.

Teaching robots dexterous tool grasping using affordance features
Locating functional affordance areas via granularity-aware extraction
Predicting grasp gestures from hand-object interaction cues
Innovation

Methods, ideas, or system contributions that make the work stand out.

Granularity-aware affordance feature extraction method
Weakly supervised learning from Exo and Ego images
Model-based post-processing for robotic action execution
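The finger-to-end coordinate transformation mentioned in the post-processing module amounts to composing rigid-body transforms: a contact point localized in the hand frame is mapped into the robot end-effector frame. A minimal homogeneous-transform sketch follows; the frame names and the example offset are assumptions for illustration, not values from the paper.

```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transform from rotation R (3x3) and translation t (3,)."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def finger_to_end_effector(p_finger, T_hand_to_ee):
    """Map a functional-finger contact point from the hand frame
    into the robot end-effector frame."""
    p_h = np.append(p_finger, 1.0)      # homogeneous coordinates
    return (T_hand_to_ee @ p_h)[:3]

# Example: end-effector frame offset 10 cm along z from the hand frame,
# no rotation (illustrative values only).
T = make_transform(np.eye(3), np.array([0.0, 0.0, 0.10]))
p_ee = finger_to_end_effector(np.array([0.02, 0.0, 0.05]), T)
# p_ee → [0.02, 0.0, 0.15]
```

In practice the hand-to-end-effector transform would come from the robot's calibration, and a rotation estimated from the predicted gesture would replace the identity used here.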
Fan Yang
School of Robotics, Hunan University, Changsha 410012, China
Wenrui Chen
Hunan University
Kailun Yang
Professor, School of Artificial Intelligence and Robotics, Hunan University (HNU); KIT; UAH; ZJU
Haoran Lin
School of Robotics, Hunan University, Changsha 410012, China
DongSheng Luo
School of Robotics, Hunan University, Changsha 410012, China
Conghui Tang
School of Robotics, Hunan University, Changsha 410012, China
Zhiyong Li
Professor of Computer Science, Hunan University
Yaonan Wang
School of Robotics, Hunan University, Changsha 410012, China