Learning Bimanual Cloth Manipulation with Vision-based Tactile Sensing via Single Robotic Arm

📅 2026-03-11

🤖 AI Summary
This work addresses the challenges of high-dimensional state spaces, deformability, and visual occlusion in fabric manipulation by proposing a single-arm bimanual cloth handling approach that avoids the hardware and control complexity of dual-arm systems. The key contributions are a compact vision-based tactile gripper, a single-arm bimanual manipulation strategy, and a Vision Transformer pipeline for cloth part classification (PC-Net) and edge pose estimation (PE-Net). To reduce annotation costs, the method uses SD-Net to generate high-fidelity synthetic tactile images. Experiments show 96% accuracy in distinguishing edges, corners, interior regions, and grasp failures, sub-millimeter edge localization, and an orientation error of 4.5°, enabling autonomous unfolding of crumpled fabrics in real-world scenarios.

๐Ÿ“ Abstract
Robotic cloth manipulation remains challenging due to the high-dimensional state space of fabrics, their deformable nature, and frequent occlusions that limit vision-based sensing. Although dual-arm systems can mitigate some of these issues, they increase hardware and control complexity. This paper presents Touch G.O.G., a compact vision-based tactile gripper and perception/control framework for single-arm bimanual cloth manipulation. The proposed framework combines three key components: (1) a novel gripper design and control strategy for in-gripper cloth sliding with a single robot arm, (2) a Vision Foundation Model-backboned Vision Transformer pipeline for cloth part classification (PC-Net) and edge pose estimation (PE-Net) using real and synthetic tactile images, and (3) an encoder-decoder synthetic data generator (SD-Net) that reduces manual annotation by producing high-fidelity tactile images. Experiments show 96% accuracy in distinguishing edges, corners, interior regions, and grasp failures, together with sub-millimeter edge localization and 4.5° orientation error. Real-world results demonstrate reliable cloth unfolding, even for crumpled fabrics, using only a single robotic arm. These results highlight Touch G.O.G. as a compact and cost-effective solution for deformable object manipulation.
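The abstract reports edge localization and orientation error without stating how they are computed. The sketch below shows one plausible way to score edge pose predictions. It is not the paper's evaluation code, and it assumes a hypothetical parameterization in which each edge pose is an (offset in millimeters, angle in degrees) pair. Because an edge is an undirected line, angles that differ by 180° are treated as the same orientation. The names `orientation_error_deg` and `mean_pose_errors` are illustrative only.

```python
def orientation_error_deg(pred_deg: float, true_deg: float) -> float:
    """Smallest angle between two undirected lines, in degrees (0 to 90)."""
    # Undirected lines repeat every 180 degrees, so wrap the difference first.
    diff = abs(pred_deg - true_deg) % 180.0
    return min(diff, 180.0 - diff)


def mean_pose_errors(preds, truths):
    """Return (mean absolute offset error in mm, mean orientation error in deg).

    Each pose is a hypothetical (offset_mm, angle_deg) pair; this layout is an
    assumption made for illustration, not the paper's representation.
    """
    if len(preds) != len(truths) or not preds:
        raise ValueError("need two non-empty sequences of equal length")
    offset_errs = [abs(p[0] - t[0]) for p, t in zip(preds, truths)]
    angle_errs = [orientation_error_deg(p[1], t[1]) for p, t in zip(preds, truths)]
    n = len(preds)
    return sum(offset_errs) / n, sum(angle_errs) / n


# Made-up example values, not data from the paper.
preds = [(0.2, 179.0), (1.0, 10.0)]
truths = [(0.0, 1.0), (0.5, 14.5)]
offset_mm, angle_deg = mean_pose_errors(preds, truths)
print(f"offset error: {offset_mm:.2f} mm, orientation error: {angle_deg:.2f} deg")
```

Wrapping the difference modulo 180° matters near the boundary: a prediction of 179° against a ground truth of 1° is a 2° error, not 178°.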
Problem

Research questions and friction points this paper is trying to address.

bimanual cloth manipulation
vision-based tactile sensing
deformable object manipulation
single robotic arm
cloth unfolding
Innovation

Methods, ideas, or system contributions that make the work stand out.

vision-based tactile sensing
single-arm bimanual manipulation
cloth manipulation
synthetic data generation
Vision Transformer