AI Summary
This work addresses garment unfolding and standardization in robotic manipulation, where complex deformations and self-occlusion hinder reliable perception and control. To this end, we propose APS-Net, a unified framework that jointly models garment unfolding and standardization for the first time. Methodologically, we design a decomposed reward function integrating coverage ratio, keypoint distance error, and IoU; introduce spatial action masking and an action optimization module; and employ a dual-arm, multi-primitive strategy (dynamic fling + pick-and-place) trained with vision-based closed-loop deep reinforcement learning. In simulation on long-sleeve garments, APS-Net improves coverage ratio by 3.9% and IoU by 5.2%, and lowers keypoint distance error by 7.09%, compared with state-of-the-art methods. Real-world validation confirms that standardized garment layouts substantially simplify subsequent folding tasks. APS-Net provides a generalizable, end-to-end solution for dexterous manipulation of highly deformable objects.
Abstract
Garment manipulation is a significant challenge for robots due to the complex dynamics and potential self-occlusion of garments. Most existing methods for efficient garment unfolding overlook the crucial role of standardizing the flattened garment, which could significantly simplify downstream tasks like folding, ironing, and packing. This paper presents APS-Net, a novel approach to garment manipulation that combines unfolding and standardization in a unified framework. APS-Net employs a dual-arm, multi-primitive policy that uses dynamic fling to quickly unfold crumpled garments and pick-and-place (P&P) for precise alignment. Garment standardization during unfolding aims not only to maximize surface coverage but also to align the garment's shape and orientation with predefined requirements. To guide effective robot learning, we introduce a novel factorized reward function for standardization, which incorporates garment coverage (Cov), keypoint distance (KD), and intersection-over-union (IoU) metrics. Additionally, we introduce a spatial action mask and an Action Optimized Module to improve unfolding efficiency by selecting actions and operation points effectively. In simulation, APS-Net outperforms state-of-the-art methods on long sleeves, achieving 3.9% better coverage, 5.2% higher IoU, and a 0.14 decrease in KD (a 7.09% relative reduction). Real-world folding tasks further demonstrate that standardization simplifies the folding process. Project page: https://hellohaia.github.io/APS/
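As a rough illustration of the three metrics named in the abstract, the sketch below computes Cov, IoU, and KD on small binary masks and combines their per-step changes into one scalar. It is a minimal sketch under stated assumptions, not the paper's reward: the function names, the equal weights, the mean-Euclidean form of KD, and the weighted-sum combination are all hypothetical choices made here for illustration.

```python
import math

def coverage(mask, max_area):
    """Visible garment pixels divided by the area of the fully flattened garment."""
    visible = sum(sum(row) for row in mask)
    return visible / max_area

def iou(mask, target):
    """Intersection-over-union between the current mask and the target (standardized) mask."""
    pairs = [(a, b) for ra, rb in zip(mask, target) for a, b in zip(ra, rb)]
    inter = sum(a & b for a, b in pairs)
    union = sum(a | b for a, b in pairs)
    return inter / union if union else 0.0

def keypoint_distance(keypoints, target_keypoints):
    """Mean Euclidean distance between matched keypoints (assumed form of KD)."""
    dists = [math.dist(p, q) for p, q in zip(keypoints, target_keypoints)]
    return sum(dists) / len(dists)

def factorized_reward(before, after, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of per-metric changes across one action (hypothetical combination).

    Cov and IoU should rise, so their increase is rewarded;
    KD should fall, so its increase is penalized.
    """
    w_cov, w_iou, w_kd = weights
    return (w_cov * (after["cov"] - before["cov"])
            + w_iou * (after["iou"] - before["iou"])
            - w_kd * (after["kd"] - before["kd"]))

# Toy example: 2x4 binary masks, 1 = garment pixel.
target = [[0, 1, 1, 0],
          [0, 1, 1, 0]]
mask = [[0, 1, 0, 0],
        [0, 1, 1, 0]]

cov = coverage(mask, max_area=4)                                # 3 of 4 pixels visible
overlap = iou(mask, target)                                     # 3 shared of 4 in the union
kd = keypoint_distance([(0, 0), (3, 4)], [(0, 0), (0, 0)])      # distances 0 and 5

reward = factorized_reward(
    before={"cov": 0.5, "iou": 0.4, "kd": 2.0},
    after={"cov": cov, "iou": overlap, "kd": 1.0},
)
```

Keeping the three terms separate makes it possible to reward alignment (IoU, KD) even on steps where coverage no longer changes, which is the motivation the abstract gives for standardization beyond coverage alone.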