🤖 AI Summary
This work addresses the challenge of real-time recognition and semantic interpretation of tactile gestures (e.g., poke, grab, stroke, double-pat) for vision- and speech-free human–robot interaction in service robotics. We propose a modular whole-body volumetric electronic skin architecture integrating distributed flexible pressure arrays with pose-sensing units. To map high-density, irregularly distributed spatiotemporal tactile signals directly to gesture semantics, we introduce an equivariant graph neural network (GNN), the first such model for end-to-end tactile gesture parsing. The system achieves high classification accuracy across multiple gesture classes and triggers the corresponding robot actions in real time. Key contributions include: (1) a scalable, modular hardware architecture for whole-body tactile perception; (2) an equivariant GNN model that preserves the spatial symmetries inherent in body-mounted sensor layouts; and (3) empirical validation of fully tactile, vision- and speech-independent human–robot interaction in realistic service scenarios.
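The paper does not publish its model code; as a rough illustration of the idea, the sketch below shows one plausible EGNN-style message-passing layer over a graph of skin sensor nodes, where node features come from pressure readings and 3D node positions come from the pose-sensing units. Because messages depend on features and pairwise distances only, the output is invariant to rigid rotations/translations of the sensor layout. The class name, dimensions, and layer structure are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical EGNN-style layer for a tactile sensor graph (illustrative only).
# h: per-node features derived from pressure readings; x: 3D sensor positions.
# Messages use only (h_i, h_j, ||x_i - x_j||^2), so outputs are unchanged when
# the whole sensor layout is rotated or translated.
import torch
import torch.nn as nn


class EquivariantTactileLayer(nn.Module):
    def __init__(self, feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        # phi_e builds edge messages from (h_i, h_j, squared distance)
        self.phi_e = nn.Sequential(
            nn.Linear(2 * feat_dim + 1, hidden_dim), nn.SiLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.SiLU(),
        )
        # phi_h updates node features from (h_i, aggregated messages)
        self.phi_h = nn.Sequential(
            nn.Linear(feat_dim + hidden_dim, hidden_dim), nn.SiLU(),
            nn.Linear(hidden_dim, feat_dim),
        )

    def forward(self, h, x, edge_index):
        # h: (N, feat_dim), x: (N, 3), edge_index: (2, E) sender/receiver ids
        src, dst = edge_index
        d2 = ((x[src] - x[dst]) ** 2).sum(-1, keepdim=True)  # invariant distances
        m = self.phi_e(torch.cat([h[src], h[dst], d2], dim=-1))
        agg = torch.zeros(h.size(0), m.size(-1), dtype=h.dtype, device=h.device)
        agg.index_add_(0, dst, m)  # sum incoming messages per node
        return h + self.phi_h(torch.cat([h, agg], dim=-1))  # residual update


# Toy usage: 5 sensor nodes, 8-dim features, a small chain of edges
h = torch.randn(5, 8)
x = torch.randn(5, 3)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
out = EquivariantTactileLayer(feat_dim=8)(h, x, edge_index)  # (5, 8)
```

Stacking such layers and pooling node features over time would give a graph-level gesture classifier; the actual architecture in the paper may differ.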
📝 Abstract
With the development of robot electronic skin technology, various tactile sensors, enhanced by AI, are unlocking a new dimension of perception for robots. In this work, we explore how robots equipped with electronic skin can recognize tactile gestures and interpret them as human commands. We developed a modular robot E-skin, composed of multiple irregularly shaped skin patches, which can be assembled to cover the robot's body while capturing real-time pressure and pose data from thousands of sensing points. To process this information, we propose a recognizer based on an equivariant graph neural network that efficiently and accurately classifies diverse tactile gestures, including poke, grab, stroke, and double-pat. By mapping the recognized gestures to predefined robot actions, we enable intuitive human–robot interaction purely through tactile input.
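To make the final step concrete, here is a minimal, hypothetical sketch of the gesture-to-action mapping described above. The gesture labels follow the abstract; the dispatch function and the robot-action callbacks are illustrative placeholders, not the paper's command set.

```python
# Hypothetical gesture-to-action dispatch for the tactile interaction loop.
# Gesture labels match the classes named in the abstract; the actions are
# placeholder stand-ins for real robot commands.
from typing import Callable, Dict

GESTURE_ACTIONS: Dict[str, Callable[[], None]] = {
    "poke":       lambda: print("robot: stop current motion"),
    "grab":       lambda: print("robot: hold position"),
    "stroke":     lambda: print("robot: acknowledge user"),
    "double-pat": lambda: print("robot: resume task"),
}

def dispatch(gesture: str) -> None:
    """Trigger the predefined robot action for a recognized tactile gesture."""
    action = GESTURE_ACTIONS.get(gesture)
    if action is None:
        print(f"robot: unrecognized gesture '{gesture}', ignoring")
    else:
        action()

dispatch("double-pat")  # classifier output -> "robot: resume task"
```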