🤖 AI Summary
Existing bimanual dexterous teleoperation systems lack a unified, reproducible, and fair benchmark for evaluation; heterogeneous hardware interfaces—such as motion-capture gloves, exoskeletons, VR, and monocular vision—and diverse task requirements impede meaningful performance comparison.
Method: We introduce the first simulation-centric benchmark specifically designed for bimanual dexterous teleoperation, comprising 30 high-fidelity physics-based task environments and supporting standardized evaluation across four mainstream teleoperation modalities.
Contribution/Results: Our benchmark establishes an externally valid, general-purpose evaluation framework through multimodal sensor modeling (IMU, VR, exoskeleton, vision), force-motion coupled task design, and a unified evaluation protocol. It achieves strong performance correlation (r > 0.92) between simulation and real-world bimanual 6-DoF dexterous hand platforms on 10 held-out tasks. This benchmark provides a reproducible, scalable, and standardized platform for both algorithm development and hardware iteration.
📝 Abstract
Teleoperation is a cornerstone of embodied-robot learning, and bimanual dexterous teleoperation in particular provides rich demonstrations that are difficult to obtain with fully autonomous systems. While recent studies have proposed diverse hardware pipelines-ranging from inertial motion-capture gloves to exoskeletons and vision-based interfaces-there is still no unified benchmark that enables fair, reproducible comparison of these systems. In this paper, we introduce TeleOpBench, a simulator-centric benchmark tailored to bimanual dexterous teleoperation. TeleOpBench contains 30 high-fidelity task environments that span pick-and-place, tool use, and collaborative manipulation, covering a broad spectrum of kinematic and force-interaction difficulty. Within this benchmark we implement four representative teleoperation modalities-(i) MoCap, (ii) VR device, (iii) arm-hand exoskeletons, and (iv) monocular vision tracking-and evaluate them with a common protocol and metric suite. To validate that performance in simulation is predictive of real-world behavior, we conduct mirrored experiments on a physical dual-arm platform equipped with two 6-DoF dexterous hands. Across 10 held-out tasks we observe a strong correlation between simulator and hardware performance, confirming the external validity of TeleOpBench. TeleOpBench establishes a common yardstick for teleoperation research and provides an extensible platform for future algorithmic and hardware innovation.