Close-Fitting Dressing Assistance Based on State Estimation of Feet and Garments with Semantic-based Visual Attention

📅 2025-05-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the critical shortage of caregiving personnel in aging societies, this study proposes a robot-assisted autonomous method for donning tight-fitting garments, specifically socks. To tackle the complexities of foot-sock interaction, including high friction, large deformations, and precise pose estimation, the method integrates RGB-D vision, joint-state feedback, and six-axis tactile force sensing into a unified multimodal perception framework. It introduces a novel semantic-driven visual attention mechanism that replaces pixel-level representations with object-level concepts, significantly enhancing cross-domain generalization. Furthermore, it employs deep spatial modeling to explicitly encode geometric relationships between sock and foot, and achieves zero-shot transfer from anthropomorphic manikin data to real human subjects while ensuring safety. Evaluated on ten unseen human participants, the method achieves substantially higher sock-donning success rates than baseline approaches (Action Chunking with Transformer and Diffusion Policy), validating the effectiveness of its state-space modeling and adaptive force-control strategy.
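The object-level attention idea in the summary can be illustrated with a minimal sketch. This is an assumption about the general technique, not the paper's actual architecture: pixel features are pooled into one token per semantic object (e.g. background, foot, sock), and a task query attends over those few tokens instead of over every pixel. The function name and the softmax-style weighting are illustrative choices.

```python
import numpy as np

def object_level_attention(features, seg_map, query):
    """Pool pixel features into object-level tokens, then attend over them.

    features: (H, W, C) per-pixel feature map
    seg_map:  (H, W) integer semantic labels (e.g. 0=background, 1=foot, 2=sock)
    query:    (C,) task query vector
    Returns an attended (C,) context vector and a {label: weight} dict.
    """
    labels = np.unique(seg_map)
    # One token per semantic object: mean of that object's pixel features.
    tokens = np.stack([features[seg_map == l].mean(axis=0) for l in labels])
    # Scaled dot-product scores, softmax-normalized over the few object tokens.
    scores = tokens @ query / np.sqrt(features.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    context = weights @ tokens
    return context, dict(zip(labels.tolist(), weights.tolist()))
```

Because attention operates over a handful of object concepts rather than raw RGB pixels, swapping in an unseen foot or background only changes the pooled token contents, which is one way such a representation can generalize across domains.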

📝 Abstract
As the population continues to age, a shortage of caregivers is expected in the future. Dressing assistance, in particular, is crucial for enabling opportunities for social participation. Dressing close-fitting garments such as socks remains especially challenging because of the need for fine force adjustments to handle friction or snagging against the skin while accounting for the shape and position of the garment. This study introduces a method that uses multi-modal information, including not only the robot's camera images, joint angles, and joint torques but also tactile forces, to achieve proper force interaction that can adapt to individual differences between humans. Furthermore, by introducing semantic information based on object concepts rather than relying solely on RGB data, the method generalizes to unseen feet and backgrounds. In addition, incorporating depth data helps infer the relative spatial relationship between the sock and the foot. To validate the method's capability for semantic object conceptualization and to ensure safety, training data were collected using a mannequin, and subsequent experiments were conducted with human subjects. In the experiments, the robot successfully adapted to previously unseen human feet and put socks on 10 participants, achieving a higher success rate than Action Chunking with Transformer and Diffusion Policy. These results demonstrate that the proposed model can estimate the state of both the garment and the foot, enabling precise dressing assistance for close-fitting garments.
Problem

Research questions and friction points this paper is trying to address.

Develops robotic dressing assistance for close-fitting garments like socks
Uses multi-modal data to adapt to individual human differences
Improves success rate with semantic-based visual attention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multi-modal data for adaptive force interaction
Incorporates semantic information for generalization
Employs depth data for spatial relationship inference
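The last bullet, depth data for spatial relationship inference, can be sketched as back-projecting a depth map through pinhole intrinsics and comparing object centroids. This is a generic illustration under assumed conventions (metric depth, semantic labels marking the sock and foot regions), not the paper's actual spatial model; `sock_foot_offset` and the label IDs are hypothetical names.

```python
import numpy as np

def sock_foot_offset(depth, seg_map, fx, fy, cx, cy, sock_id=2, foot_id=1):
    """Estimate the 3-D centroid offset from sock to foot using a depth map.

    depth:   (H, W) depth in metres
    seg_map: (H, W) semantic labels; sock_id / foot_id mark the two objects
    fx, fy, cx, cy: pinhole camera intrinsics
    Returns the (3,) vector from the foot centroid to the sock centroid.
    """
    v, u = np.indices(depth.shape)  # pixel row (v) and column (u) grids
    z = depth
    # Back-project every pixel to camera coordinates (x, y, z).
    pts = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=-1)
    sock_c = pts[seg_map == sock_id].mean(axis=0)
    foot_c = pts[seg_map == foot_id].mean(axis=0)
    return sock_c - foot_c
```

A relative quantity like this offset is invariant to where the foot sits in the image, which is one plausible reason depth-derived geometry helps the policy reason about sock-foot alignment.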