Exploiting Information Theory for Intuitive Robot Programming of Manual Activities

πŸ“… 2024-10-31
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
How can robots understand and generalize manual manipulation skills from a single RGB video demonstration? Existing approaches rely on trajectory imitation, which suffers from poor generalization across environments. Method: We propose an end-to-end video-to-behavior-tree (BT) learning framework tailored for non-expert users. For the first time, we integrate Shannon information theory into manual task modeling, enabling scene-element identification and task-structure parsing via mutual information quantification. Scene graphs encode environmental context to support behavior segmentation, and executable BT policies are generated automatically. Contribution/Results: Our method requires only one human demonstration to produce robust, cross-scene robot policies. We release HANDSOME, a multi-agent manual skill dataset. Experiments demonstrate significant improvements over state-of-the-art baselines in task generalization, environmental adaptability, and execution success rate.
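The summary says the framework compiles the parsed task structure into executable Behavior Tree (BT) policies. As a minimal sketch of what a BT execution loop looks like (hypothetical node and action names; the paper's actual BT format is not specified here), a Sequence node ticks its children in order and fails fast:

```python
# Minimal behavior-tree sketch. Node/action names are illustrative,
# not taken from the paper.
class Action:
    """Leaf node wrapping a robot action; tick() returns success/failure."""
    def __init__(self, name, fn):
        self.name, self.fn = name, fn

    def tick(self):
        return self.fn()

class Sequence:
    """Composite node: ticks children left to right, stops on first failure."""
    def __init__(self, children):
        self.children = children

    def tick(self):
        return all(child.tick() for child in self.children)

log = []
pick = Action("pick(cup)", lambda: log.append("pick") or True)
place = Action("place(cup, table)", lambda: log.append("place") or True)
tree = Sequence([pick, place])
print(tree.tick(), log)  # True ['pick', 'place']
```

A real system would add Fallback (selector) nodes and condition leaves, but the Sequence semantics above are the core of how a segmented demonstration maps onto an ordered execution plan.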

πŸ“ Abstract
Observational learning is a promising approach to enable people without programming expertise to transfer skills to robots in a user-friendly manner, since it mirrors how humans learn new behaviors by observing others. Many existing methods focus on instructing robots to mimic human trajectories, but such motion-level strategies often struggle to generalize skills across diverse environments. This paper proposes a novel framework that allows robots to achieve a higher-level understanding of human-demonstrated manual tasks recorded in RGB videos. By recognizing the task structure and goals, robots can generalize what they observe to unseen scenarios. We ground our task representation in Shannon's Information Theory (IT), which we apply for the first time to manual tasks. IT helps extract the active scene elements and quantify the information shared between hands and objects. We exploit scene-graph properties to encode the extracted interaction features in a compact structure and to segment the demonstration into blocks, streamlining the generation of Behavior Trees for robot replication. Experiments validated the effectiveness of IT in automatically generating robot execution plans from a single human demonstration. Additionally, we provide HANDSOME, an open-source dataset of HAND Skills demOnstrated by Multi-subjEcts, to promote further research and evaluation in this field.
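The abstract's key quantity is the information shared between hands and objects. For discrete sequences (e.g. clustered hand states and touched-object labels per frame), mutual information can be estimated from joint counts. The sketch below is a generic plug-in estimator, not the paper's specific formulation; the example sequences are invented for illustration:

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equally long discrete sequences, from their empirical joint counts."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy * log2( p_xy / (p_x * p_y) ), with p_x = px[x]/n, p_y = py[y]/n
        mi += p_xy * log2(p_xy * n * n / (px[x] * py[y]))
    return mi

# Hypothetical per-frame labels: discretized hand state vs. object in contact.
hand = ["reach", "grasp", "grasp", "move", "move", "release", "idle", "idle"]
obj  = ["cup",   "cup",   "cup",   "cup",  "cup",  "cup",     "none", "none"]
print(mutual_information(hand, obj))
```

High mutual information between a hand track and an object track marks that object as an "active" scene element during the demonstration, which is the intuition behind using IT for scene-element identification.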
Problem

Research questions and friction points this paper is trying to address.

Robot Learning
Observational Learning
Manual Skill Acquisition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Shannon's Information Theory
Behavior Tree Generation
HANDSOME Dataset
πŸ”Ž Similar Papers
No similar papers found.
Elena Merlo
Human-Robot Interfaces and Interaction, Istituto Italiano di Tecnologia, Genoa, Italy and Dept. of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genoa, Genoa, Italy
Marta Lagomarsino
Human-Robot Interfaces and Interaction, Istituto Italiano di Tecnologia, Genoa, Italy
Edoardo Lamon
Assistant Professor at UniversitΓ  di Trento
Human-Robot Teaming, Human-Robot Interaction, Robot Learning and Control, Ergonomics
Arash Ajoudani
Tenured Senior Scientist, Istituto Italiano di Tecnologia
Collaborative Robotics, Physical Human-Robot Interaction, Human-Robot Collaboration, Assistive Robotics, Telerobotics