🤖 AI Summary
Prosthetic hands for transradial amputees exhibit poor robustness and limited generalization when grasping unseen objects in real-world scenarios. Method: This work introduces a semantic projection mechanism and the Grasp-LLaVA vision-language model to enable cross-category grasp-type inference; further, it proposes an edge-cloud collaborative hybrid graph neural network (HGN) architecture integrating confidence calibration and dynamic inference path switching. Results: Experiments show a 50.2% grasp accuracy on unseen objects; the HGN with dynamic switching achieves 42.3% accuracy while accelerating inference by 3.5×; in real-world settings, average accuracy reaches 86%, outperforming pure edge-based approaches by 12.2%. This work significantly enhances the adaptability and practicality of prosthetic hands in open, unstructured environments.
📝 Abstract
For transradial amputees, robotic prosthetic hands promise to restore the ability to perform activities of daily living. To advance next-generation prosthetic hand control design, it is crucial to address current shortcomings in robustness to out-of-lab artifacts and in generalization to new environments. Because existing datasets contain a fixed set of objects to interact with, in contrast to the virtually infinite variety of objects encountered in the real world, current grasp models perform poorly on unseen objects, negatively affecting users' independence and quality of life.
To address this: (i) we define semantic projection, the ability of a model to generalize to unseen object types, and show that conventional models such as YOLO, despite 80% training accuracy, drop to 15% on unseen objects; (ii) we propose Grasp-LLaVA, a grasp vision-language model that performs human-like reasoning to infer a suitable grasp type from an object's physical characteristics, reaching 50.2% accuracy on unseen object types, compared to 36.7% for a SOTA grasp estimation model.
Lastly, to bridge the performance-latency gap, we propose the Hybrid Grasp Network (HGN), an edge-cloud deployment infrastructure that enables fast grasp estimation on the edge with accurate cloud inference as a fail-safe, effectively expanding the latency-accuracy Pareto frontier. HGN with confidence calibration (DC) dynamically switches between edge and cloud models, improving semantic projection accuracy by 5.6% (to 42.3%) with a 3.5x speedup on unseen object types. Over a real-world sample mix, it reaches 86% average accuracy (a 12.2% gain over edge-only) and 2.2x faster inference than Grasp-LLaVA alone.
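The dynamic switching idea above can be sketched as a simple routing rule: run the fast edge model first, calibrate its confidence, and escalate to the slower cloud model only when that confidence falls below a threshold. The sketch below is a minimal illustration, not the paper's implementation; the function names, the use of temperature scaling as the calibration method, and the threshold value are all assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def calibrate(logits, temperature=2.0):
    """Temperature scaling: a common post-hoc confidence calibration
    (assumed here as a stand-in for the paper's calibration step)."""
    return softmax([x / temperature for x in logits])

def hybrid_grasp(edge_logits, cloud_infer, threshold=0.7, temperature=2.0):
    """Return (grasp_index, source). Keep the edge prediction if its
    calibrated confidence clears the threshold; otherwise fall back
    to the cloud model (a hypothetical callable standing in for a
    Grasp-LLaVA-style inference service)."""
    probs = calibrate(edge_logits, temperature)
    conf = max(probs)
    if conf >= threshold:
        return probs.index(conf), "edge"
    return cloud_infer(), "cloud"

# Usage: a confident edge prediction stays on-device...
grasp, src = hybrid_grasp([8.0, 0.5, 0.2], cloud_infer=lambda: 1)
# ...while an ambiguous one escalates to the cloud model.
grasp2, src2 = hybrid_grasp([1.1, 1.0, 0.9], cloud_infer=lambda: 1)
```

The threshold trades latency against accuracy: raising it routes more samples to the cloud (slower, more accurate), while lowering it keeps more inference on the edge, which is how an edge-cloud design can expand the latency-accuracy Pareto frontier.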