🤖 AI Summary
This work addresses the lack of systematic evaluation standards for multi-object grasping and manipulation. We propose MOGB, the first standardized benchmark for multi-object grasping in both stacked and planar scenes, featuring three core protocols: (1) one-shot grasping (efficiency), (2) precise selective grasping and relocation (accuracy), and (3) complete scene clearing (robustness). MOGB supports diverse end-effectors—including Barrett Hand, Robotiq 2F-85, and Pisa/IIT SoftHand-2—and integrates 3D perception with motion planning modules to enable reproducible baselines and quantitative cross-platform performance comparison. Crucially, it is the first framework to jointly evaluate efficiency, accuracy, and robustness under unified conditions, while incorporating human performance as a reference benchmark. MOGB provides an extensible, reproducible, and platform-agnostic evaluation standard for robotic manipulation research.
📝 Abstract
In this work, we describe a multi-object grasping benchmark to evaluate the grasping and manipulation capabilities of robotic systems in both pile and surface scenarios. The benchmark introduces three multi-object grasping protocols, each designed to challenge a different aspect of robotic manipulation: 1) the Only-Pick-Once protocol, which assesses the robot's ability to efficiently pick multiple objects in a single attempt; 2) the Accurate pick-transferring protocol, which evaluates the robot's capacity to selectively grasp and transport a specific number of objects from a cluttered environment; and 3) the Pick-transferring-all protocol, which challenges the robot to clear an entire scene by sequentially grasping and transferring all available objects. These protocols are intended to be adopted by the broader robotics research community, providing a standardized method to assess and compare robotic systems' performance in multi-object grasping tasks. We establish baselines for these protocols using standard planning and perception algorithms on a Barrett hand, a Robotiq parallel jaw gripper, and the Pisa/IIT Softhand-2, which is a soft underactuated robotic hand. We also discuss the results in relation to human performance in similar tasks.
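As a rough illustration of how trials under the three protocols might be recorded and checked, the following Python sketch encodes one possible success condition per protocol. It is not the benchmark's official scoring: the `Trial` fields, the `trial_succeeded` and `success_rate` helpers, and the specific success conditions (for example, requiring at least two objects in a single attempt for Only-Pick-Once) are all assumptions made for this example, inferred only from the protocol descriptions in the abstract.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Protocol(Enum):
    # Names follow the abstract; the string values are illustrative only.
    ONLY_PICK_ONCE = "only-pick-once"
    ACCURATE_PICK_TRANSFERRING = "accurate-pick-transferring"
    PICK_TRANSFERRING_ALL = "pick-transferring-all"


@dataclass
class Trial:
    """Hypothetical record of one benchmark trial (fields are assumptions)."""
    protocol: Protocol
    objects_in_scene: int
    objects_moved: int           # objects picked or transferred in this trial
    grasp_attempts: int
    target_count: Optional[int] = None  # used only by accurate pick-transferring


def trial_succeeded(trial: Trial) -> bool:
    """Assumed success conditions, one per protocol."""
    if trial.protocol is Protocol.ONLY_PICK_ONCE:
        # Assumption: success means several objects picked in exactly one attempt.
        return trial.grasp_attempts == 1 and trial.objects_moved >= 2
    if trial.protocol is Protocol.ACCURATE_PICK_TRANSFERRING:
        # Assumption: success means moving exactly the requested number of objects.
        return trial.target_count is not None and trial.objects_moved == trial.target_count
    # Pick-transferring-all: assumption is that every object must leave the scene.
    return trial.objects_moved == trial.objects_in_scene


def success_rate(trials: List[Trial]) -> float:
    """Fraction of trials that meet their protocol's assumed success condition."""
    if not trials:
        return 0.0
    return sum(trial_succeeded(t) for t in trials) / len(trials)


example_trials = [
    Trial(Protocol.ONLY_PICK_ONCE, objects_in_scene=10, objects_moved=3, grasp_attempts=1),
    Trial(Protocol.ACCURATE_PICK_TRANSFERRING, objects_in_scene=10, objects_moved=2,
          grasp_attempts=2, target_count=3),
    Trial(Protocol.PICK_TRANSFERRING_ALL, objects_in_scene=5, objects_moved=5, grasp_attempts=3),
]
```

Keeping the per-protocol condition in a single function makes it easy to swap in the benchmark's actual metrics once they are known, without changing how trials are logged.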