End-to-End Dexterous Grasp Learning from Single-View Point Clouds via a Multi-Object Scene Dataset

📅 2026-03-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses multi-object dexterous grasping, which is hindered by environmental interference and inadequate modeling of pre-grasp poses. Existing datasets are often limited to single objects and predefined configurations, resulting in poor generalization. To overcome these limitations, the authors propose DGS-Net, an end-to-end network that directly predicts dense dexterous grasp configurations from a single-view point cloud. They also introduce a new dataset comprising 307 objects, 240 scenes, and over 350,000 valid grasps. By employing a two-stage grasp generation strategy that explicitly models grasp offsets and pre-grasp poses, combined with dense synthesis and scene-level validation, the method improves robustness and accuracy in cluttered scenes. Experiments demonstrate success rates of 88.63% in simulation and 78.98% on a physical robot, with a mean penetration depth of only 0.375 mm, substantially outperforming existing approaches.

📝 Abstract
Dexterous grasping in multi-object scenes constitutes a fundamental challenge in robotic manipulation. Current mainstream grasping datasets predominantly focus on single-object scenarios and predefined grasp configurations, often neglecting environmental interference and the modeling of dexterous pre-grasp gestures, thereby limiting their generalizability in real-world applications. To address this, we propose DGS-Net, an end-to-end grasp prediction network capable of learning dense grasp configurations from single-view point clouds in multi-object scenes. Furthermore, we propose a two-stage grasp data generation strategy that progresses from dense single-object grasp synthesis to dense scene-level grasp generation. Our dataset comprises 307 objects, 240 multi-object scenes, and over 350k validated grasps. By explicitly modeling grasp offsets and pre-grasp configurations, the dataset provides more robust and accurate supervision for dexterous grasp learning. Experimental results show that DGS-Net achieves grasp success rates of 88.63% in simulation and 78.98% on a real robotic platform, while exhibiting lower penetration, with a mean penetration depth of 0.375 mm and a penetration volume of 559.45 mm³, outperforming existing methods and demonstrating strong effectiveness and generalization capability. Our dataset is available at https://github.com/4taotao8/DGS-Net.
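The abstract emphasizes explicit modeling of grasp offsets and pre-grasp configurations, but does not give the construction. A common convention is to place the pre-grasp pose by retreating the hand a fixed standoff distance along the grasp's approach axis; the sketch below illustrates only that convention, not DGS-Net's actual formulation, and the function name and 5 cm offset are assumptions for illustration.

```python
import numpy as np

def pregrasp_from_grasp(grasp_pos, approach_dir, offset=0.05):
    """Place the pre-grasp position `offset` meters behind the grasp
    position along the (normalized) approach direction."""
    a = np.asarray(approach_dir, dtype=float)
    a = a / np.linalg.norm(a)                      # unit approach axis
    return np.asarray(grasp_pos, dtype=float) - offset * a

# Example: grasp at the origin, approaching along +z, with a 5 cm standoff.
pre = pregrasp_from_grasp([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], offset=0.05)
print(pre)  # pre-grasp sits 5 cm behind the grasp along the approach axis
```

Supervising this offset explicitly, as the paper describes, gives the network a collision-free waypoint to aim for before closing the fingers, rather than only the final contact pose.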
Problem

Research questions and friction points this paper addresses.

dexterous grasping, multi-object scenes, grasp generalization, environmental interference, pre-grasp configuration
Innovation

Methods, ideas, and system contributions that make the work stand out.

dexterous grasping, multi-object scenes, end-to-end grasp learning, point clouds, grasp dataset