MG-Grasp: Metric-Scale Geometric 6-DoF Grasping Framework with Sparse RGB Observations

📅 2026-03-17
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of generating physically plausible 6-DoF grasps from RGB inputs, a task hindered by the lack of precise geometric representation in existing methods. The authors propose a novel depth-sensor-free framework that leverages only two-view RGB images and known camera parameters to reconstruct a dense, metrically scaled, and multi-view consistent point cloud using a 3D foundation model. Stable grasp poses are then generated directly from this reconstructed geometry. To the best of the authors’ knowledge, this is the first approach to achieve high-quality, metrically accurate, and geometrically consistent grasping under sparse RGB observations, effectively overcoming the geometric fidelity limitations of conventional RGB-based methods. Experiments demonstrate state-of-the-art performance on both the GraspNet-1Billion benchmark and real-world scenarios among RGB-only 6-DoF grasping approaches.
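The summary's central geometric point is that two RGB views with known camera intrinsics and extrinsics determine 3D structure at metric scale. A minimal way to see this is classical two-view triangulation: with calibrated projection matrices, a point's depth is fixed by the (metric) baseline between the cameras. The sketch below is illustrative only; the camera parameters, point, and helper functions are invented for this example, and the paper itself uses a learned 3D foundation model for dense reconstruction, not DLT triangulation:

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Recover one 3D point from two pixel observations via linear DLT.
    P1, P2: 3x4 projection matrices (K @ [R | t]); x1, x2: (u, v) pixels."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)      # null-space of A gives homogeneous X
    X = Vt[-1]
    return X[:3] / X[3]              # dehomogenize -> metric XYZ

def project(P, X):
    """Pinhole projection of a 3D point to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Illustrative calibrated pair: shared intrinsics, 0.1 m baseline along x.
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0,   0.0,   1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])               # reference camera
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0], [0]])])   # second camera, 0.1 m away

X_true = np.array([0.05, -0.02, 0.8])   # ground-truth point, 0.8 m in front

X_hat = triangulate_dlt(P1, P2, project(P1, X_true), project(P2, X_true))
print(np.allclose(X_hat, X_true, atol=1e-6))  # prints True: scale is metric
```

The recovered point comes back in meters precisely because the baseline in `P2` is expressed in meters; without known extrinsics, reconstruction is only defined up to an unknown scale, which is the ambiguity the paper's known-camera setup removes.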

πŸ“ Abstract
Single-view RGB-D grasp detection remains a common choice in 6-DoF robotic grasping systems, but it typically requires a depth sensor. While RGB-only 6-DoF grasp methods have been studied recently, their inaccurate geometric representation is not directly suitable for physically reliable robotic manipulation, hindering reliable grasp generation. To address these limitations, we propose MG-Grasp, a novel depth-free 6-DoF grasping framework that achieves high-quality object grasping. Leveraging a two-view 3D foundation model with camera intrinsics/extrinsics, our method reconstructs metric-scale, multi-view-consistent dense point clouds from sparse RGB images and generates stable 6-DoF grasps. Experiments on the GraspNet-1Billion dataset and in the real world demonstrate that MG-Grasp achieves state-of-the-art (SOTA) grasp performance among RGB-based 6-DoF grasping methods.
Problem

Research questions and friction points this paper is trying to address.

6-DoF grasping
RGB-only
geometric representation
robotic manipulation
grasp reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Metric-scale reconstruction
6-DoF grasping
Sparse RGB observations
Depth-free manipulation
Multi-view consistent point cloud
Kangxu Wang
The Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Siang Chen
The Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Chenxing Jiang
The Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Shaojie Shen
Associate Professor, Hong Kong University of Science and Technology (Robotics)
Yixiang Dai
The Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Guijin Wang
tsinghua.edu.cn (computer vision, 3D imaging, robot manipulation)