AI Summary
To address the real-time challenge of manipulating highly self-occluded, crumpled, and suspended garments, this paper proposes a bimanual, confidence-aware manipulation framework. Methodologically: (1) a dense visual correspondence model trained with a distributional loss enables robust inter-frame feature matching; (2) a self-supervised visuotactile grasp affordance network jointly predicts graspable regions and their associated uncertainty; (3) a perception-confidence-driven reactive state machine supports task-agnostic grasp selection and cross-modal policy transfer. Evaluated in both simulation and real-world settings, the system achieves, for the first time, stable folding and hanging of severely occluded, suspended clothing. Moreover, it generalizes by extracting grasp targets directly from human demonstration videos. Experiments demonstrate significant improvements in dynamic adaptability, robustness to occlusion and deformation, and cross-task generalization.
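To make the confidence mechanism concrete, one common way to obtain a per-match confidence from a dense correspondence model is to softmax the descriptor similarities into a match distribution and score its peakedness via normalized entropy. The sketch below is illustrative only; the paper's exact distributional loss and confidence estimator are not specified here, and the function name and temperature parameter are assumptions.

```python
import numpy as np

def correspondence_with_confidence(query_desc, target_descs, temperature=0.1):
    """Match one query descriptor against candidate target descriptors.

    Illustrative sketch (not the paper's exact formulation): returns the
    best-match index and a confidence in [0, 1] derived from the entropy
    of the softmax match distribution (low entropy = peaked = confident).
    """
    # Cosine similarity between the query and every target descriptor.
    q = query_desc / np.linalg.norm(query_desc)
    t = target_descs / np.linalg.norm(target_descs, axis=1, keepdims=True)
    sims = t @ q

    # Temperature-scaled softmax over candidate locations -> distribution.
    logits = sims / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Entropy normalized by log(N) lies in [0, 1]; confidence = 1 - entropy.
    entropy = -(probs * np.log(probs + 1e-12)).sum() / np.log(len(probs))
    return int(probs.argmax()), float(1.0 - entropy)
```

A symmetric garment (e.g. two indistinguishable sleeve cuffs) would yield a bimodal, higher-entropy distribution and hence lower confidence, which is the signal the reactive state machine can act on.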
Abstract
Manipulating clothing is challenging due to complex configurations, variable material dynamics, and frequent self-occlusion. Prior systems often flatten garments or assume visibility of key features. We present a dual-arm visuotactile framework that combines confidence-aware dense visual correspondence and tactile-supervised grasp affordance to operate directly on crumpled and suspended garments. The correspondence model is trained on a custom, high-fidelity simulated dataset using a distributional loss that captures cloth symmetries and generates correspondence confidence estimates. These estimates guide a reactive state machine that adapts folding strategies based on perceptual uncertainty. In parallel, a visuotactile grasp affordance network, self-supervised using high-resolution tactile feedback, determines which regions are physically graspable. The same tactile classifier is used during execution for real-time grasp validation. By deferring action in low-confidence states, the system handles highly occluded table-top and in-air configurations. We demonstrate our task-agnostic grasp selection module in folding and hanging tasks. Moreover, our dense descriptors provide a reusable intermediate representation for other planning modalities, such as extracting grasp targets from human video demonstrations, paving the way for more generalizable and scalable garment manipulation.
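The abstract's key control idea, deferring action in low-confidence states and validating grasps with the tactile classifier at execution time, can be sketched as a small confidence-gated state machine. This is a minimal illustration under assumed state names and threshold; the paper's actual state machine is richer.

```python
from enum import Enum, auto

class State(Enum):
    PERCEIVE = auto()     # run dense correspondence, estimate confidence
    DEFER = auto()        # confidence too low: re-observe instead of acting
    GRASP = auto()        # attempt grasp at the selected region
    MANIPULATE = auto()   # fold / hang once the grasp is tactilely validated

def next_state(state, correspondence_conf, grasp_valid, conf_threshold=0.7):
    """One transition of an illustrative confidence-gated state machine.

    Acting is only permitted when perceptual confidence clears the
    threshold; otherwise the system defers and re-perceives, per the
    abstract. The threshold value 0.7 is an assumption.
    """
    if state is State.PERCEIVE:
        return State.GRASP if correspondence_conf >= conf_threshold else State.DEFER
    if state is State.DEFER:
        return State.PERCEIVE  # re-observe the (possibly perturbed) garment
    if state is State.GRASP:
        # The tactile classifier validates the grasp in real time.
        return State.MANIPULATE if grasp_valid else State.PERCEIVE
    return State.PERCEIVE
```

The design choice worth noting is that failure at either gate (visual confidence or tactile validation) routes back to perception rather than forcing an action, which is how the system tolerates heavy occlusion.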