ASGrasp: Generalizable Transparent Object Reconstruction and 6-DoF Grasp Detection from RGB-D Active Stereo Camera

📅 2024-05-09
🏛️ IEEE International Conference on Robotics and Automation
📈 Citations: 4
Influential: 0
🤖 AI Summary
Grasping transparent and specular objects in cluttered scenes remains challenging due to the failure of conventional depth cameras to recover reliable geometry. Method: This paper proposes the first end-to-end solution leveraging an RGB-D active stereo camera—specifically, a two-stage learned stereo network that directly reconstructs material-invariant geometric structure from raw infrared and RGB images, bypassing depth-map inpainting. The framework integrates domain-randomized synthetic data (GraspNet-1Billion), end-to-end 6-DoF grasp pose prediction, and a sim-to-real transfer pipeline. Contribution/Results: Without relying on pre-recovered depth priors, our method achieves over 90% real-world grasping success rate—significantly surpassing state-of-the-art approaches. Notably, it is the first to outperform the ideal point-cloud oracle baseline, thereby breaking the performance ceiling of existing methods.

📝 Abstract
In this paper, we tackle the problem of grasping transparent and specular objects. This issue is important, yet it remains unsolved in robotics due to the failure of depth cameras to recover the accurate geometry of such objects. For the first time, we propose ASGrasp, a 6-DoF grasp detection network that uses an RGB-D active stereo camera. ASGrasp utilizes a two-layer learning-based stereo network for transparent object reconstruction, enabling material-agnostic object grasping in cluttered environments. In contrast to existing RGB-D based grasp detection methods, which heavily depend on depth restoration networks and the quality of depth maps generated by depth cameras, our system directly utilizes raw IR and RGB images for transparent object geometry reconstruction. We create an extensive synthetic dataset through domain randomization based on GraspNet-1Billion. Our experiments demonstrate that ASGrasp achieves over a 90% success rate for generalizable transparent object grasping in both simulation and the real world via seamless sim-to-real transfer. Our method significantly outperforms SOTA networks and even surpasses the performance upper bound set by perfect visible point cloud inputs. Project page: https://pku-epic.github.io/ASGrasp
Problem

Research questions and friction points this paper is trying to address.

Grasping transparent and specular objects in robotics
Reconstructing transparent object geometry using RGB-D stereo camera
Achieving material-agnostic grasping in cluttered environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses RGB-D active stereo camera
Two-layer learning-based stereo network
Directly utilizes raw IR and RGB images, bypassing depth restoration
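The two-stage dataflow described above (raw IR stereo pair + RGB → learned geometry reconstruction → 6-DoF grasp prediction) can be sketched as follows. This is a minimal illustrative sketch only: the function names, tensor shapes, camera parameters, and placeholder logic are assumptions for exposition, not the paper's actual networks or API.

```python
import numpy as np

def stereo_reconstruct(ir_left, ir_right, rgb):
    """Stage 1 (sketch): the paper's learned stereo network maps the raw IR
    pair plus RGB to material-invariant geometry. Here a constant disparity
    stands in for the network's output."""
    h, w = ir_left.shape
    disparity = np.ones((h, w))          # placeholder for regressed disparity
    focal, baseline = 600.0, 0.055       # illustrative camera intrinsics
    return focal * baseline / disparity  # depth = f * b / disparity

def detect_grasps(depth, rgb):
    """Stage 2 (sketch): a grasp network predicts 6-DoF poses from the
    reconstructed geometry. Returns (translation, rotation, width) tuples."""
    t = np.array([0.0, 0.0, float(depth.mean())])  # dummy grasp center
    R = np.eye(3)                                  # dummy gripper orientation
    return [(t, R, 0.08)]

# Dummy sensor inputs standing in for the active stereo camera's raw streams.
ir_l = np.zeros((480, 640))
ir_r = np.zeros((480, 640))
rgb = np.zeros((480, 640, 3))

depth = stereo_reconstruct(ir_l, ir_r, rgb)
grasps = detect_grasps(depth, rgb)
print(depth.shape, len(grasps))
```

The key design point this sketch mirrors is that no pre-restored depth map enters the pipeline: geometry comes directly from the raw IR/RGB inputs.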
Jun Shi
Samsung R&D Institute China-Beijing
Yong A
Samsung R&D Institute China-Beijing
Yixiang Jin
Samsung R&D Institute China-Beijing
Dingzhe Li
Samsung R&D Institute China-Beijing
Haoyu Niu
Fudan University
Zhezhu Jin
Samsung R&D Institute China-Beijing
He Wang
CFCS, School of Computer Science, Peking University; Galbot; Beijing Academy of Artificial Intelligence (BAAI)