MonoDuo: Using One Robot Arm to Learn Bimanual Policies

📅 2026-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the scarcity of real-world data for bimanual robot policy learning with a collaborate-then-swap data collection scheme. A teleoperated single-arm robot performs one side of a bimanual task while a human performs the other; the roles are then swapped so that the demonstrations cover both sides. Using hand pose estimation, RGB-D perception, image and point cloud segmentation, inpainting, and kinematic modeling, the method converts these recordings into synthetic demonstrations for a target bimanual robot, which are then used for imitation learning. The resulting policies transfer to unseen bimanual robot configurations, either zero-shot or with few-shot fine-tuning. On five bimanual tasks, the method reaches up to 70% success in zero-shot settings, and fine-tuning with only 25 target-robot demonstrations raises success rates by 65–70% over training from scratch.
📝 Abstract
Bimanual coordination is essential for many real-world manipulation tasks, yet learning bimanual robot policies is limited by the scarcity of bimanual robots and datasets. Single-arm robots, however, are widely available in research labs. Can we leverage them to train bimanual robot policies? We present MonoDuo, a framework for learning bimanual manipulation policies using single-arm robot demonstrations paired with human collaboration. MonoDuo collects data by teleoperating a single-arm robot to perform one side of a bimanual task while a human performs the other, then swapping roles to cover both sides. RGB-D observations from a wrist-mounted and fixed camera are augmented into synthetic demonstrations for target bimanual robots using state-of-the-art hand pose estimation, image and point cloud segmentation, and inpainting. These synthetic demonstrations, grounded in real robot kinematics, are used to train bimanual policies. We evaluate MonoDuo on five tasks: box lifting, backpack packing, cloth folding, jacket zipping, and plate handover. Compared to approaches relying solely on human bimanual videos, MonoDuo enables zero-shot deployment on unseen bimanual robot configurations, achieving success rates up to 70%. With only 25 target robot demonstrations, few-shot finetuning further boosts success rates by 65-70% over training from scratch, demonstrating MonoDuo's effectiveness in efficiently transferring knowledge from single-arm robot data to bimanual robot policies.
Problem

Research questions and friction points this paper is trying to address.

bimanual manipulation
robot learning
data scarcity
single-arm robot
policy transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

bimanual manipulation
single-arm robot
synthetic demonstration
zero-shot transfer
few-shot fine-tuning
Sandeep Bajamahal
University of California, Berkeley
Lawrence Yunliang Chen
PhD Student, UC Berkeley
Robotics, Machine Learning
Toru Lin
UC Berkeley
Zehan Ma
University of California, Berkeley
Jitendra Malik
University of California, Berkeley
Ken Goldberg
Professor, UC Berkeley and UCSF
Robots, Robotics, Automation, Collaborative Filtering