3DSGrasp: 3D Shape-Completion for Robotic Grasp

📅 2023-01-02
🏛️ IEEE International Conference on Robotics and Automation
📈 Citations: 23
Influential: 0
🤖 AI Summary
In real-world robotic grasping, objects are often viewed from few and sparse viewpoints, so the captured point clouds are incomplete, leading to wrong or inaccurate 6D grasp poses. Method: The paper proposes 3DSGrasp, a grasping strategy built around a Transformer-based encoder-decoder network that completes the partial point cloud before grasp poses are generated. Its core component is an Offset-Attention layer, which makes the network inherently invariant to object pose and to the permutation of input points, so the completed point clouds remain geometrically consistent. Results: On a wide range of partial point clouds, the method outperforms the state of the art on completion accuracy, and when deployed on a physical robot it substantially improves grasp success rates, demonstrating both effectiveness and practical applicability.
📝 Abstract
Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to object pose and point permutation, so it generates PCDs that are geometrically consistent and properly completed. Experiments on a wide range of partial PCDs show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset are available at: https://github.com/NunoDuarte/3DSGrasp.
Problem

Research questions and friction points this paper is trying to address.

Completes partial 3D point clouds for reliable robotic grasping
Predicts missing geometry to generate accurate grasp poses
Improves grasping success rates with pose-invariant completion network
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based encoder-decoder network for point cloud completion
Offset-Attention layer ensures geometric consistency and completion
Pose-invariant and permutation-invariant design for robust grasping
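
The Offset-Attention idea named above (known from the Point Cloud Transformer line of work) replaces the usual self-attention output with the *offset* between the input features and the attended features before the residual connection. Below is a minimal NumPy sketch of that mechanism; all shapes, weight names, and the final linear+ReLU are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def offset_attention(x, Wq, Wk, Wv, Wo):
    """Offset-Attention over a point set.

    x: (N, d) per-point features; Wq/Wk/Wv/Wo: (d, d) projections.
    Standard self-attention is computed first; the layer then feeds
    the offset (input minus attended features) through a linear+ReLU
    and adds it back, which is what distinguishes it from vanilla
    self-attention.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(x.shape[1]), axis=-1)  # (N, N)
    attended = attn @ v                                     # (N, d)
    offset = x - attended                                   # offset features
    out = np.maximum(offset @ Wo, 0.0)                      # linear + ReLU
    return x + out                                          # residual connection

rng = np.random.default_rng(0)
N, d = 6, 8
x = rng.standard_normal((N, d))
W = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]
y = offset_attention(x, *W)

# Reordering the input points reorders the output rows identically,
# i.e. the layer treats the point set as order-free.
perm = rng.permutation(N)
assert np.allclose(offset_attention(x[perm], *W), y[perm])
```

The assertion illustrates the permutation property the bullets refer to: because attention mixes points by content rather than by index, the layer commutes with any reordering of the input points.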
S. S. Mohammadi
Department of Marine, Electrical, Electronic and Telecommunications Engineering, University of Genoa, Italy
N. Duarte
Vislab, Institute for Systems and Robotics—Lisboa, Instituto Superior Técnico, Universidade de Lisboa, Portugal
D. Dimou
Vislab, Institute for Systems and Robotics—Lisboa, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Yiming Wang
Pattern Analysis & Computer Vision (PAVIS), Istituto Italiano di Tecnologia (IIT), Genoa, Italy
M. Taiana
Pattern Analysis & Computer Vision (PAVIS), Istituto Italiano di Tecnologia (IIT), Genoa, Italy
Pietro Morerio
Researcher @ IIT
Computer Vision · Pattern Recognition · Machine Learning · Deep Learning · Artificial Intelligence
Atabak Dehban
Vislab, Institute for Systems and Robotics—Lisboa, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Plinio Moreno
Institute for Systems and Robotics, Instituto Superior Tecnico (ISR/IST), LARSyS, Univ Lisboa
object manipulation · computer vision · robotics · machine learning · human activity recognition
Alexandre Bernardino
Institute for Systems and Robotics (ISR/IST), LARSyS, Instituto Superior Técnico, Univ Lisboa
Computer Vision · Robotics
A. D. Bue
Pattern Analysis & Computer Vision (PAVIS), Istituto Italiano di Tecnologia (IIT), Genoa, Italy
J. Santos-Victor
Vislab, Institute for Systems and Robotics—Lisboa, Instituto Superior Técnico, Universidade de Lisboa, Portugal