ASGrasp: Generalizable Transparent Object Reconstruction and 6-DoF Grasp Detection from RGB-D Active Stereo Camera

📅 2024-05-09
🏛️ IEEE International Conference on Robotics and Automation
📈 Citations: 4
Influential: 0
🤖 AI Summary
Grasping transparent and specular objects in cluttered scenes remains challenging due to the failure of conventional depth cameras to recover reliable geometry. Method: This paper proposes the first end-to-end solution leveraging an RGB-D active stereo camera—specifically, a two-stage learned stereo network that directly reconstructs material-invariant geometric structure from raw infrared and RGB images, bypassing depth-map inpainting. The framework integrates domain-randomized synthetic data (GraspNet-1Billion), end-to-end 6-DoF grasp pose prediction, and a sim-to-real transfer pipeline. Contribution/Results: Without relying on pre-recovered depth priors, our method achieves over 90% real-world grasping success rate—significantly surpassing state-of-the-art approaches. Notably, it is the first to outperform the ideal point-cloud oracle baseline, thereby breaking the performance ceiling of existing methods.

📝 Abstract
In this paper, we tackle the problem of grasping transparent and specular objects. This issue is important, yet it remains unsolved in robotics due to the failure of depth cameras to recover the accurate geometry of such objects. For the first time, we propose ASGrasp, a 6-DoF grasp detection network that uses an RGB-D active stereo camera. ASGrasp utilizes a two-layer learning-based stereo network for transparent object reconstruction, enabling material-agnostic object grasping in cluttered environments. In contrast to existing RGB-D based grasp detection methods, which heavily depend on depth restoration networks and the quality of depth maps generated by depth cameras, our system directly utilizes raw IR and RGB images for transparent object geometry reconstruction. We create an extensive synthetic dataset through domain randomization based on GraspNet-1Billion. Our experiments demonstrate that ASGrasp achieves over a 90% success rate for generalizable transparent object grasping in both simulation and the real world via seamless sim-to-real transfer. Our method significantly outperforms SOTA networks and even surpasses the performance upper bound set by perfect visible point cloud inputs. Project page: https://pku-epic.github.io/ASGrasp
Problem

Research questions and friction points this paper is trying to address.

Grasping transparent and specular objects in robotics
Reconstructing transparent object geometry using RGB-D stereo camera
Achieving material-agnostic grasping in cluttered environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses RGB-D active stereo camera
Two-layer learning-based stereo network
Directly utilizes raw IR and RGB images, bypassing depth restoration
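The two-stage dataflow described above (raw IR stereo pair + RGB → learned geometry reconstruction → 6-DoF grasp prediction) can be sketched as follows. This is a minimal illustrative sketch only: the function names, tensor shapes, camera parameters, and placeholder logic are assumptions for exposition, not the paper's actual networks or API.

```python
import numpy as np

def stereo_reconstruct(ir_left, ir_right, rgb):
    """Stage 1 (sketch): the paper's learned stereo network maps the raw IR
    pair plus RGB to material-invariant geometry. Here a constant disparity
    stands in for the network's output."""
    h, w = ir_left.shape
    disparity = np.ones((h, w))          # placeholder for regressed disparity
    focal, baseline = 600.0, 0.055       # illustrative camera intrinsics
    return focal * baseline / disparity  # depth = f * b / disparity

def detect_grasps(depth, rgb):
    """Stage 2 (sketch): a grasp network predicts 6-DoF poses from the
    reconstructed geometry. Returns (translation, rotation, width) tuples."""
    t = np.array([0.0, 0.0, float(depth.mean())])  # dummy grasp center
    R = np.eye(3)                                  # dummy gripper orientation
    return [(t, R, 0.08)]

# Dummy sensor inputs standing in for the active stereo camera's raw streams.
ir_l = np.zeros((480, 640))
ir_r = np.zeros((480, 640))
rgb = np.zeros((480, 640, 3))

depth = stereo_reconstruct(ir_l, ir_r, rgb)
grasps = detect_grasps(depth, rgb)
print(depth.shape, len(grasps))
```

The key design point this sketch mirrors is that no pre-restored depth map enters the pipeline: geometry comes directly from the raw IR/RGB inputs.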
Jun Shi
Samsung R&D Institute China-Beijing
Yong A
Samsung R&D Institute China-Beijing
Yixiang Jin
Samsung R&D Institute China-Beijing
Dingzhe Li
Samsung R&D Institute China-Beijing
Haoyu Niu
Fudan University
Zhezhu Jin
Samsung R&D Institute China-Beijing
He Wang
CFCS, School of Computer Science, Peking University; Galbot; Beijing Academy of Artificial Intelligence (BAAI)