ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping

📅 2025-04-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Robotic grasping often suffers from suboptimal motion planning and collisions due to insufficient geometric modeling of the scene. To address this, we propose an end-to-end framework that jointly optimizes 3D reconstruction and grasp pose prediction, enabling zero-shot, near real-time inference (≈30 FPS) from a single partial-view RGB-D frame. Methodologically, we introduce a co-optimization paradigm that tightly couples reconstruction and grasping, a grasp-oriented occlusion-aware reasoning mechanism, and a large-scale synthetic dataset comprising 1M photo-realistic images, 12K objects, and 11.3B physically-valid grasp annotations. Our approach integrates multi-task deep learning, physics-constrained grasp sampling, and synthetic-data-driven generalization. Evaluated on the GraspNet-1B benchmark, it achieves state-of-the-art performance; real-robot experiments further demonstrate strong zero-shot generalization to unseen objects, with high accuracy and fast inference in both reconstruction and grasping.
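
To make the co-optimization idea concrete, the minimal PyTorch sketch below shows how a single multi-task objective could couple a reconstruction head and a grasp head trained from a shared encoder. The class name, loss choices, tensor shapes, and weights are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch (not the authors' code): a multi-task objective that
# couples shape reconstruction and grasp pose prediction.
import torch
import torch.nn as nn

class JointReconGraspLoss(nn.Module):
    """Multi-task objective coupling occupancy reconstruction and grasp regression."""
    def __init__(self, w_recon: float = 1.0, w_grasp: float = 1.0):
        super().__init__()
        self.w_recon = w_recon                     # weight on the reconstruction term
        self.w_grasp = w_grasp                     # weight on the grasp term
        self.recon_loss = nn.BCEWithLogitsLoss()   # voxel occupancy supervision
        self.grasp_loss = nn.SmoothL1Loss()        # grasp pose / score regression

    def forward(self, occ_logits, occ_gt, grasp_pred, grasp_gt):
        # Both heads share an encoder upstream, so the grasp gradient also
        # shapes the reconstruction features (and vice versa).
        return (self.w_recon * self.recon_loss(occ_logits, occ_gt)
                + self.w_grasp * self.grasp_loss(grasp_pred, grasp_gt))

# Dummy example: a batch of 2 scenes, 32^3 occupancy grids, 16 grasp candidates each.
loss_fn = JointReconGraspLoss()
occ_logits = torch.randn(2, 1, 32, 32, 32, requires_grad=True)
occ_gt = torch.randint(0, 2, (2, 1, 32, 32, 32)).float()
grasp_pred = torch.randn(2, 16, 7, requires_grad=True)   # hypothetical 7-DoF grasp per candidate
grasp_gt = torch.randn(2, 16, 7)
loss = loss_fn(occ_logits, occ_gt, grasp_pred, grasp_gt)
loss.backward()
print(float(loss))
```

In a setup like this, the relative weighting of the two terms trades off geometric fidelity against grasp accuracy; the key point from the summary is that the two tasks are optimized jointly rather than in separate stages.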

📝 Abstract
Robotic grasping is a cornerstone capability of embodied systems. Many methods directly output grasps from partial information without modeling the geometry of the scene, leading to suboptimal motion and even collisions. To address these issues, we introduce ZeroGrasp, a novel framework that simultaneously performs 3D reconstruction and grasp pose prediction in near real-time. A key insight of our method is that occlusion reasoning and modeling the spatial relationships between objects are beneficial for both accurate reconstruction and grasping. We couple our method with a novel large-scale synthetic dataset, which comprises 1M photo-realistic images, high-resolution 3D reconstructions, and 11.3B physically-valid grasp pose annotations for 12K objects from the Objaverse-LVIS dataset. We evaluate ZeroGrasp on the GraspNet-1B benchmark as well as through real-world robot experiments. ZeroGrasp achieves state-of-the-art performance and generalizes to novel real-world objects by leveraging synthetic data.
Problem

Research questions and friction points this paper is trying to address.

Simultaneous 3D reconstruction and grasp pose prediction
Addressing suboptimal motion and collisions in robotic grasping (a simple collision-screening sketch follows this list)
Generalizing to novel objects using synthetic data
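
The collision point above suggests one straightforward downstream use of the reconstruction: screening grasp candidates against the recovered geometry before execution. The NumPy sketch below is an illustration only; the occupancy-grid format, straight-line approach model, and parameter values are assumptions made for this example, not the paper's actual pipeline.

```python
# Hypothetical illustration (not from the paper): filter grasp candidates by
# checking the gripper approach path against a reconstructed occupancy grid.
import numpy as np

def collision_free(grasp_center, approach_dir, occupancy, voxel_size=0.01,
                   origin=np.zeros(3), clearance=0.03):
    """Return True if the straight-line approach to the grasp stays in free space.

    grasp_center : (3,) grasp position in metres
    approach_dir : (3,) unit approach direction of the gripper
    occupancy    : (X, Y, Z) boolean occupancy grid from the reconstruction
    """
    # Sample points along the approach segment that ends at the grasp centre.
    samples = grasp_center - np.linspace(0.0, clearance, 8)[:, None] * approach_dir
    idx = np.floor((samples - origin) / voxel_size).astype(int)
    # Points outside the grid are treated as free space.
    inside = np.all((idx >= 0) & (idx < np.array(occupancy.shape)), axis=1)
    return not occupancy[idx[inside, 0], idx[inside, 1], idx[inside, 2]].any()

# Example: a 64^3 grid with a small occupied block, and one top-down candidate grasp.
grid = np.zeros((64, 64, 64), dtype=bool)
grid[30:34, 30:34, 30:34] = True
ok = collision_free(np.array([0.32, 0.32, 0.40]),
                    np.array([0.0, 0.0, -1.0]), grid)
print("collision-free:", ok)
```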
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simultaneous 3D reconstruction and grasp prediction
Occlusion reasoning for spatial relationship modeling
Large-scale synthetic dataset for training