Reactive In-Air Clothing Manipulation with Confidence-Aware Dense Correspondence and Visuotactile Affordance

📅 2025-09-04
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address real-time in-air manipulation of highly self-occluded, crumpled, and suspended garments, this paper proposes a bimanual, confidence-aware manipulation framework. Methodologically: (1) a dense visual correspondence model trained with a distributional loss enables robust inter-frame feature matching; (2) a self-supervised visuotactile grasp affordance network jointly predicts graspable regions and their uncertainty; (3) a perception-confidence-driven reactive state machine supports task-agnostic grasp selection and cross-modal policy transfer. Evaluated in both simulation and real-world settings, the system achieves, for the first time, stable folding and hanging of severely occluded, suspended clothing, and generalizes grasp targets directly from human demonstration videos. Experiments demonstrate significant improvements in dynamic adaptability, robustness to occlusion and deformation, and cross-task generalization.
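The confidence-driven reactive behavior in point (3) can be pictured as a small state machine that acts only when perception clears a confidence threshold and otherwise defers. This is an illustrative reconstruction, not the paper's implementation; the states, thresholds, and transition rules are assumptions:

```python
# Hypothetical sketch of a confidence-gated reactive loop: proceed to grasp
# and manipulate only when correspondence and grasp-affordance confidences
# clear thresholds; otherwise defer and re-observe (e.g., after regrasping).
from dataclasses import dataclass
from enum import Enum, auto

class State(Enum):
    PERCEIVE = auto()
    GRASP = auto()
    DEFER = auto()
    MANIPULATE = auto()

@dataclass
class Percept:
    corr_confidence: float   # confidence of the dense correspondence match
    grasp_confidence: float  # predicted graspability of the target region

def step(state: State, p: Percept,
         corr_thresh: float = 0.7, grasp_thresh: float = 0.6) -> State:
    """One transition of the illustrative confidence-gated state machine."""
    if state is State.PERCEIVE:
        if p.corr_confidence < corr_thresh:
            return State.DEFER          # perception too uncertain: do not act
        return State.GRASP
    if state is State.GRASP:
        if p.grasp_confidence < grasp_thresh:
            return State.DEFER          # target region predicted ungraspable
        return State.MANIPULATE
    if state is State.DEFER:
        return State.PERCEIVE           # re-observe before committing to act
    return State.MANIPULATE
```

Deferring in low-confidence states is what lets the real system wait out severe occlusion instead of committing to a bad grasp.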

๐Ÿ“ Abstract
Manipulating clothing is challenging due to complex configurations, variable material dynamics, and frequent self-occlusion. Prior systems often flatten garments or assume visibility of key features. We present a dual-arm visuotactile framework that combines confidence-aware dense visual correspondence and tactile-supervised grasp affordance to operate directly on crumpled and suspended garments. The correspondence model is trained on a custom, high-fidelity simulated dataset using a distributional loss that captures cloth symmetries and generates correspondence confidence estimates. These estimates guide a reactive state machine that adapts folding strategies based on perceptual uncertainty. In parallel, a visuotactile grasp affordance network, self-supervised using high-resolution tactile feedback, determines which regions are physically graspable. The same tactile classifier is used during execution for real-time grasp validation. By deferring action in low-confidence states, the system handles highly occluded table-top and in-air configurations. We demonstrate our task-agnostic grasp selection module in folding and hanging tasks. Moreover, our dense descriptors provide a reusable intermediate representation for other planning modalities, such as extracting grasp targets from human video demonstrations, paving the way for more generalizable and scalable garment manipulation.
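The abstract describes a distributional correspondence loss that captures cloth symmetries and yields confidence estimates. A rough NumPy illustration of that idea follows; the paper's actual loss and network are not given here, and the multi-modal Gaussian target, the entropy-based confidence, and all names are our assumptions:

```python
import numpy as np

def distributional_corr_loss(logits, target_pixels, shape, sigma=2.0):
    """Cross-entropy between a predicted match distribution over an H x W
    score map and a multi-modal Gaussian target (one mode per symmetric
    ground-truth match). Also returns an entropy-based confidence in [0, 1]."""
    H, W = shape
    rr, cc = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    target = np.zeros((H, W))
    for r, c in target_pixels:  # several modes when symmetry makes it ambiguous
        target += np.exp(-((rr - r) ** 2 + (cc - c) ** 2) / (2 * sigma ** 2))
    target = target.ravel() / target.sum()
    m = logits.max()
    log_p = logits - (m + np.log(np.exp(logits - m).sum()))  # stable log-softmax
    loss = -(target * log_p).sum()                           # cross-entropy
    p = np.exp(log_p)
    entropy = -(p * log_p).sum()
    confidence = 1.0 - entropy / np.log(logits.size)  # 1 = peaked, 0 = uniform
    return loss, confidence
```

A peaked score map gives a confidence near 1; a uniform map gives 0, which is the kind of signal the reactive state machine could threshold on.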
Problem

Research questions and friction points this paper is trying to address.

Manipulating crumpled suspended garments with occlusion
Determining graspable regions using visuotactile feedback
Handling perceptual uncertainty through confidence-aware correspondence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-arm visuotactile framework with dense correspondence
Reactive state machine guided by confidence estimates
Self-supervised visuotactile grasp affordance network
Neha Sunil
MIT
Megha Tippur
Massachusetts Institute of Technology
Arnau Saumell
Prosper AI
Edward Adelson
Massachusetts Institute of Technology
Alberto Rodriguez
Director of Robot Behavior, Atlas, Boston Dynamics
Robotics · Robotic Manipulation · Dexterous Manipulation · Grasping