Rethinking Bimanual Robotic Manipulation: Learning with Decoupled Interaction Framework

📅 2025-03-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In bimanual robotic manipulation, existing integrated control models take the perceptions and states of both arms as joint input, which enforces cooperation from the earliest stage. This suits coordinated tasks (e.g., bimanual object transport) but neglects uncoordinated ones (e.g., grasping an object with the closest hand). To address this, we propose a decoupled interaction framework with two parts: (1) an independent model per arm, which improves learning of uncoordinated tasks; and (2) a selective interaction module that adaptively learns weights from its own arm, so that information from the other arm contributes mainly when coordination is required. The result is a novel design that treats coordinated and uncoordinated tasks differently instead of forcing both through one integrated model. Evaluated on seven tasks in the RoboTwin dataset, the framework improves on the state-of-the-art method by 23.5%. With only 1/6 of the model size, it still surpasses the state of the art by 16.5% in success rate, which indicates that the gain comes from the decoupled design itself rather than from model capacity. The framework can also be integrated into existing methods and extended to multi-agent manipulation, where it achieves a 28% boost over the integrated control state of the art.

📝 Abstract

Bimanual robotic manipulation is an emerging and critical topic in the robotics community. Previous works primarily rely on integrated control models that take the perceptions and states of both arms as inputs to directly predict their actions. However, we argue that bimanual manipulation involves not only coordinated tasks but also various uncoordinated tasks that do not require explicit cooperation during execution, such as grasping objects with the closest hand. Integrated control frameworks fail to account for these tasks because they enforce cooperation at the input stage. In this paper, we propose a novel decoupled interaction framework that considers the characteristics of different tasks in bimanual manipulation. The key insight of our framework is to assign an independent model to each arm to enhance the learning of uncoordinated tasks, while introducing a selective interaction module that adaptively learns weights from its own arm to improve the learning of coordinated tasks. Extensive experiments on seven tasks in the RoboTwin dataset demonstrate that: (1) Our framework achieves outstanding performance, with a 23.5% boost over the SOTA method. (2) Our framework is flexible and can be seamlessly integrated into existing methods. (3) Our framework can be effectively extended to multi-agent manipulation tasks, achieving a 28% boost over the integrated control SOTA. (4) The performance boost stems from the decoupled design itself, surpassing the SOTA by 16.5% in success rate with only 1/6 of the model size.

Problem

Research questions and friction points this paper is trying to address.

Addresses limitations of integrated control in bimanual robotic manipulation.

Proposes a decoupled framework for both coordinated and uncoordinated tasks.

Enhances performance and flexibility in multi-agent manipulation tasks.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoupled interaction framework for bimanual manipulation

Independent models for each arm enhance learning of uncoordinated tasks

Selective interaction module improves learning of coordinated tasks
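
The decoupled design can be illustrated with a minimal sketch. This is not the authors' implementation: the sigmoid gate, the additive fusion rule, and every name below (`selective_interaction`, `decoupled_step`, `gate_weights`, `gate_bias`, the `params` keys) are assumptions chosen only to show the idea of each arm keeping its own feature and deciding, from that feature alone, how much of the other arm's feature to mix in.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def sigmoid(x: float) -> float:
    """Logistic function mapping a real score to a gate value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def selective_interaction(
    own: Vector, other: Vector, gate_weights: Vector, gate_bias: float
) -> Vector:
    """Fuse the other arm's feature into this arm's feature.

    Assumption: the gate is a scalar computed from this arm's own feature
    only, as one reading of "learns weights from its own arm". In a real
    model gate_weights and gate_bias would be trained parameters.
    """
    if not (len(own) == len(other) == len(gate_weights)):
        raise ValueError("feature and weight vectors must have equal length")
    score = sum(w * x for w, x in zip(gate_weights, own)) + gate_bias
    gate = sigmoid(score)
    # A gate near 0 leaves the arm independent; a gate near 1 mixes in
    # the other arm's feature at full strength.
    return [o + gate * p for o, p in zip(own, other)]


def decoupled_step(
    left_feat: Vector, right_feat: Vector, params: Dict[str, object]
) -> Tuple[Vector, Vector]:
    """Run one interaction step with a separate gate for each arm."""
    left_fused = selective_interaction(
        left_feat, right_feat, params["left_w"], params["left_b"]
    )
    right_fused = selective_interaction(
        right_feat, left_feat, params["right_w"], params["right_b"]
    )
    return left_fused, right_fused


if __name__ == "__main__":
    # Hypothetical parameters: the left gate is effectively closed,
    # the right gate is half open (sigmoid(0) == 0.5).
    params = {
        "left_w": [0.0, 0.0], "left_b": -50.0,
        "right_w": [0.0, 0.0], "right_b": 0.0,
    }
    print(decoupled_step([1.0, 2.0], [2.0, 4.0], params))
```

In this sketch a strongly negative gate bias makes an arm behave as a fully independent policy, which corresponds to the uncoordinated case, while a more open gate lets the other arm's feature influence the result, which corresponds to the coordinated case.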