Sim2Real Transfer for Vision-Based Grasp Verification

๐Ÿ“… 2025-05-05
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Addressing the challenge of verifying grasps on deformable objects, this paper proposes a purely vision-based two-stage approach: first localizing the robotic gripper using YOLOv8, then classifying grasp success with a ResNet. Key contributions include: (1) introducing HSR-GraspSynth, the first synthetic dataset specifically designed for grasp verification; (2) pioneering the use of Visual Question Answering (VQA) as a zero-shot baseline for this task; and (3) achieving efficient Sim2Real transfer, yielding high accuracy in real-world settings while enabling seamless integration into existing grasp pipelines. The method eliminates reliance on force or tactile sensors, significantly improving robustness for non-rigid objects. Both code and dataset are publicly released.

๐Ÿ“ Abstract
The verification of successful grasps is a crucial aspect of robot manipulation, particularly when handling deformable objects. Traditional methods relying on force and tactile sensors often struggle with deformable and non-rigid objects. In this work, we present a vision-based approach for grasp verification that determines whether the robotic gripper has successfully grasped an object. Our method employs a two-stage architecture: first, a YOLO-based object detection model detects and localizes the robot's gripper; then, a ResNet-based classifier determines whether an object is present. To address the limitations of real-world data capture, we introduce HSR-GraspSynth, a synthetic dataset designed to simulate diverse grasping scenarios. Furthermore, we explore the use of Visual Question Answering capabilities as a zero-shot baseline against which we compare our model. Experimental results demonstrate that our approach achieves high accuracy in real-world environments, with potential for integration into grasping pipelines. Code and datasets are publicly available at https://github.com/pauamargant/HSR-GraspSynth.
Problem

Research questions and friction points this paper is trying to address.

Vision-based grasp verification for deformable objects
Overcoming limitations of force and tactile sensors
Sim2Real transfer using synthetic dataset HSR-GraspSynth
Innovation

Methods, ideas, or system contributions that make the work stand out.

YOLO-based object detection for gripper localization
ResNet classifier for object presence verification
HSR-GraspSynth synthetic dataset for diverse scenarios
๐Ÿ”Ž Similar Papers
No similar papers found.
Pau Amargant
Polytechnic University of Catalonia
Peter Hönig
Faculty of Electrical Engineering, Technical University of Vienna
Markus Vincze
TU Wien
Robot vision · home robotics · making robots see