🤖 AI Summary
This paper addresses task-oriented human grasp synthesis by proposing a scene- and task-aware method. To explicitly encode task semantics alongside scene geometry and functional context, the authors introduce the task-aware contact map and design a two-stage diffusion-based generation framework: it first produces task-consistent coarse hand-object contact distributions, then refines them into physically feasible hand poses. The authors further construct T-Grasp, a large-scale task-oriented grasping dataset, and propose new evaluation metrics, including Task Completion Rate, to quantitatively assess task alignment. Experiments demonstrate that the approach significantly outperforms existing methods in grasp plausibility, task adaptability, and interaction naturalness, validating the importance of jointly modeling task intent and scene constraints for high-fidelity grasp synthesis.
📝 Abstract
In this paper, we study task-oriented human grasp synthesis, a new grasp synthesis problem that demands both task and context awareness. At the core of our method are task-aware contact maps. Unlike traditional contact maps, which reason only about the manipulated object and its relation to the hand, our enhanced maps also take scene and task information into account. This comprehensive map is critical for hand-object interaction, enabling accurate grasp poses that align with the task. We propose a two-stage pipeline: the first stage constructs a task-aware contact map informed by the scene and task, and the second stage uses this contact map to synthesize task-oriented human grasps. We introduce a new dataset and a new metric to evaluate our approach on the proposed task. Our experiments validate the importance of modeling both scene and task, demonstrating significant improvements over existing methods in both grasp quality and task performance. See our project page for more details: https://hcis-lab.github.io/TOHGS/
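To make the two-stage structure concrete, the sketch below mimics the pipeline with toy heuristics: stage 1 scores per-point contact probabilities on the object conditioned on scene clearance and a task label, and stage 2 reduces that map to a grasp target. Every function name, threshold, and rule here is an illustrative assumption, not the paper's learned models.

```python
import math

def stage1_contact_map(object_points, scene_points, task):
    """Stage 1 (toy stand-in): coarse per-point contact probabilities on the
    object, conditioned on scene geometry and the task label."""
    top = max(p[2] for p in object_points)
    bottom = min(p[2] for p in object_points)
    span = max(top - bottom, 1e-6)
    probs = []
    for p in object_points:
        # Scene awareness: points too close to scene geometry are hard to reach.
        clearance = min(math.dist(p, s) for s in scene_points)
        reachable = min(clearance / 0.05, 1.0)  # saturates at 5 cm (arbitrary)
        # Task awareness: e.g. a "handover" task prefers contacts near the top
        # of the object, leaving the rest free for the receiver (toy rule).
        height = (p[2] - bottom) / span
        task_pref = height if task == "handover" else 1.0 - abs(height - 0.5)
        probs.append(reachable * task_pref)
    return probs

def stage2_grasp_from_contacts(object_points, contact_probs, k=3):
    """Stage 2 (toy stand-in): refine the contact map into a grasp, reduced
    here to picking the k most likely contact points and averaging them
    into a palm target in place of full pose synthesis."""
    ranked = sorted(zip(contact_probs, object_points), reverse=True)
    contacts = [p for _, p in ranked[:k]]
    palm = tuple(sum(c[i] for c in contacts) / k for i in range(3))
    return {"contacts": contacts, "palm_target": palm}

# Usage: a column of object points next to a scene surface, handover task.
obj = [(0.0, 0.0, z / 10) for z in range(11)]
scene = [(0.0, 0.3, 0.0)]
cmap = stage1_contact_map(obj, scene, task="handover")
grasp = stage2_grasp_from_contacts(obj, cmap)
print(grasp["palm_target"])
```

The separation mirrors the paper's design choice: contact reasoning (what to touch, given scene and task) is decoupled from pose synthesis (how to touch it), so the second stage only needs to be consistent with the map produced by the first.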