🤖 AI Summary
This work addresses the critical bimanual coordination skill of “self-handover”—the seamless transfer of an object between a person’s own hands—whose structure remains underexplored despite its importance in complex manipulation tasks such as cooking. To bridge this gap, we introduce the first systematic taxonomy of self-handover, manually annotated from over 12 hours of real-world cooking videos involving 21 participants. We propose a forward-looking dual-hand coordination model that challenges the conventional view of self-handover as a merely passive transition, instead framing it as an active, anticipatory process. Integrating multimodal behavioral analysis with vision-language models (VLMs), our approach achieves high-accuracy recognition of self-handover action types. The resulting scalable taxonomy and automated recognition paradigm provide both theoretical foundations and practical tools for dexterous bimanual manipulation in robotics.
📝 Abstract
Self-handover, the transfer of an object between one's own hands, is a common but understudied bimanual action. While it facilitates seamless transitions in complex tasks, the strategies underlying its execution remain largely unexplored. Here, we introduce the first systematic taxonomy of self-handover, derived from manual annotation of over 12 hours of cooking activity performed by 21 participants. Our analysis reveals that self-handover is not merely a passive transition, but a highly coordinated action involving anticipatory adjustments by both hands. As a step toward automated analysis of human manipulation, we further demonstrate the feasibility of classifying self-handover types using a state-of-the-art vision-language model. These findings offer fresh insights into bimanual coordination, underscoring the role of self-handover in enabling smooth task transitions, an ability essential for adaptive dual-arm robotics.