MultiGraspNet: A Multitask 3D Vision Model for Multi-gripper Robotic Grasping

📅 2026-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a unified multitask 3D vision model for robotic grasping that addresses a key limitation of existing methods, which are typically confined to a single gripper type or rely on custom hybrid mechanisms and thus do not generalize across end effectors. By adding gripper-specific refinement modules on top of shared early-stage features, the model simultaneously predicts feasible grasp poses for both parallel-jaw grippers and vacuum suction cups, letting the two grasping modalities exchange complementary information. Trained on aligned versions of the GraspNet-1Billion and SuctionNet-1Billion datasets, the approach achieves significant gains on a real single-arm multi-gripper platform: grasp success rates improve by 16% on seen objects and by 32% on novel objects, while matching the accuracy of specialized single-task models on parallel-jaw tasks, substantially enhancing robustness and generalization in cluttered environments.

📝 Abstract
Vision-based models for robotic grasping automate critical, repetitive, and draining industrial tasks. Existing approaches are typically limited in two ways: they either target a single gripper, requiring costly dual-arm setups to handle multiple end effectors, or rely on custom hybrid grippers whose ad-hoc learning procedures and logic cannot be transferred across tasks, restricting their general applicability. In this work, we present MultiGraspNet, a novel multitask 3D deep learning method that predicts feasible poses simultaneously for parallel and vacuum grippers within a unified framework, enabling a single robot to handle multiple end effectors. The model is trained on the richly annotated GraspNet-1Billion and SuctionNet-1Billion datasets, aligned for this purpose, and generates graspability masks quantifying the suitability of each scene point for successful grasps. By sharing early-stage features while maintaining gripper-specific refiners, MultiGraspNet effectively leverages complementary information across grasping modalities, enhancing robustness and adaptability in cluttered scenes. We characterize MultiGraspNet's performance with an extensive experimental analysis, demonstrating its competitiveness with single-task models on relevant benchmarks. We run real-world experiments on a single-arm multi-gripper robotic setup showing that our approach outperforms the vacuum baseline, grasping 16% more seen objects and 32% more novel ones, while obtaining competitive results on the parallel task.
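The shared-encoder / gripper-specific-refiner design described in the abstract can be sketched schematically as follows. This is a minimal illustration, not the authors' actual architecture: the layer sizes, the single-hidden-layer encoder, and the head names `parallel` and `suction` are all assumptions made for the example. It shows the core idea of computing shared per-point features once and branching into one graspability-mask head per end effector.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_layer(x, w, b):
    # One linear layer with ReLU, applied independently to each scene point.
    return np.maximum(x @ w + b, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical dimensions: N scene points, 3D coordinates in,
# 32-dimensional shared features (chosen for illustration only).
N, D_IN, D_SHARED = 1024, 3, 32

# Shared early-stage encoder weights (random, untrained stand-ins).
w_shared = rng.normal(0.0, 0.1, (D_IN, D_SHARED))
b_shared = np.zeros(D_SHARED)

# Gripper-specific refiner weights: one lightweight head per end effector.
heads = {
    "parallel": (rng.normal(0.0, 0.1, (D_SHARED, 1)), np.zeros(1)),
    "suction":  (rng.normal(0.0, 0.1, (D_SHARED, 1)), np.zeros(1)),
}

def predict_graspability(points):
    """Return a per-point graspability mask in [0, 1] for each gripper."""
    feats = mlp_layer(points, w_shared, b_shared)      # computed once, shared
    return {
        name: sigmoid(feats @ w + b).squeeze(-1)       # gripper-specific score
        for name, (w, b) in heads.items()
    }

points = rng.normal(size=(N, D_IN))                    # stand-in point cloud
masks = predict_graspability(points)
```

Because the encoder runs once for all heads, each gripper's refiner trains on features shaped by both tasks, which is the mechanism behind the cross-modal complementarity the abstract claims.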
Problem

Research questions and friction points this paper is trying to address.

robotic grasping
multi-gripper
vision-based models
general applicability
grasping modalities
Innovation

Methods, ideas, or system contributions that make the work stand out.

multitask learning
3D vision
multi-gripper grasping
grasp prediction
feature sharing
Stephany Ortuno-Chanelo
VANDAL Laboratory, Department of Control and Computer Engineering, Politecnico di Torino, Turin, Italy.
Paolo Rabino
VANDAL Laboratory, Department of Control and Computer Engineering, Politecnico di Torino, Turin, Italy.
Enrico Civitelli
Machine Learning Engineer at Comau
Machine Learning
Tatiana Tommasi
Politecnico di Torino
machine learning · computer vision · artificial intelligence
Raffaello Camoriano
Assistant Professor, Politecnico di Torino; Affiliated Researcher, Istituto Italiano di Tecnologia
Robot Learning · Incremental/Lifelong Learning · Structured Learning · Kernel Methods