🤖 AI Summary
To address the limited robustness of robotic grasping in cluttered environments, this work introduces the first large-scale, real-world grasping dataset designed specifically for highly cluttered scenes. The dataset comprises 1,000 densely stacked scenes (averaging 14.1 objects per scene with a 62.6% occlusion rate), 200 object categories, 75 environment configurations, and multi-view RGB-D imagery, making it the first to systematically cover high-density occlusion, diverse environmental setups, and multi-view perception. It provides 736K annotated 6D object poses and 9.3 billion feasible 6D grasp configurations across 52K RGB-D images, substantially surpassing existing benchmarks in scale and complexity. Data acquisition employs a synchronized four-camera system, and annotations integrate multi-view calibration with manual verification. Experiments demonstrate that models trained on this dataset achieve substantial gains in generalization for semantic segmentation, 6D pose estimation, and 6D grasp detection, both in simulation and on physical robotic platforms.
📝 Abstract
Robust grasping in cluttered environments remains an open challenge in robotics. While benchmark datasets have significantly advanced deep learning methods, they mainly focus on simplistic scenes with light occlusion and insufficient diversity, limiting their applicability to practical scenarios. We present GraspClutter6D, a large-scale real-world grasping dataset featuring: (1) 1,000 highly cluttered scenes with dense arrangements (14.1 objects/scene, 62.6% occlusion), (2) comprehensive coverage across 200 objects in 75 environment configurations (bins, shelves, and tables) captured using four RGB-D cameras from multiple viewpoints, and (3) rich annotations including 736K 6D object poses and 9.3B feasible robotic grasps for 52K RGB-D images. We benchmark state-of-the-art segmentation, object pose estimation, and grasp detection methods to provide key insights into the challenges of cluttered environments. Additionally, we validate the dataset's effectiveness as a training resource, demonstrating that grasping networks trained on GraspClutter6D significantly outperform those trained on existing datasets in both simulation and real-world experiments. The dataset, toolkit, and annotation tools are publicly available on our project website: https://sites.google.com/view/graspclutter6d.
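The 6D object poses and grasp configurations described above are rigid transforms in SE(3): a 3D rotation plus a 3D translation. A minimal sketch of how such annotations compose is given below; the frame names and the object/grasp poses are illustrative assumptions, not GraspClutter6D's actual storage format or values.

```python
import numpy as np

def make_pose(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 SE(3) matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def rot_z(theta: float) -> np.ndarray:
    """Rotation of `theta` radians about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical annotation: an object pose in the camera frame and a grasp
# pose defined in the object frame. Composing the two transforms expresses
# the grasp in the camera frame, as a grasp detector would predict it.
obj_in_cam = make_pose(rot_z(np.pi / 2), np.array([0.1, 0.0, 0.5]))
grasp_in_obj = make_pose(np.eye(3), np.array([0.0, 0.0, 0.05]))
grasp_in_cam = obj_in_cam @ grasp_in_obj
```

Multi-view setups like the four-camera rig mentioned in the abstract rely on exactly this kind of composition: per-camera extrinsics map a single object pose into each camera's frame.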