Conveying Emotions to Robots through Touch and Sound

📅 2024-12-04
🏛️ ICSR + AI
📈 Citations: 4
Influential: 0
🤖 AI Summary
This study addresses the reliability and interpretability of tactile–auditory bimodal emotional expression in human–robot interaction. To this end, we synchronously recorded spontaneous tactile gestures and accompanying vocalizations using piezoresistive pressure sensors and a microphone, constructing a cross-subject multimodal dataset comprising 10 emotion categories. We first empirically demonstrate significant inter-subject consistency in tactile emotional expression. We then propose a tactile–auditory feature-fusion approach, revealing that emotion confusion is predominantly governed by similarity in arousal and valence. Using SVM classification, the average accuracy across all 10 emotions reaches 40%, with "Attention" achieving 87.65%. Our core contributions are threefold: (1) establishing the cross-subject robustness of tactile affective expression; (2) introducing the first tactile–auditory collaborative framework for emotion recognition; and (3) elucidating the cognitive origins of poorly differentiated emotions, namely low discriminability arising from overlapping affective dimensions.
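The inter-subject consistency claimed above is measured in the paper with intraclass correlation. As a minimal sketch of how such a statistic is computed, here is a numpy implementation of ICC(2,1) (two-way random effects, absolute agreement, single rater); the `icc2_1` helper and the toy rating matrices are illustrative assumptions, not the paper's actual analysis code.

```python
import numpy as np

def icc2_1(X):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.
    X has shape (n_targets, k_raters), e.g. (gestures, participants)."""
    n, k = X.shape
    grand = X.mean()
    row_means = X.mean(axis=1)   # per-target means
    col_means = X.mean(axis=0)   # per-rater means
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)  # between-target mean square
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)  # between-rater mean square
    resid = X - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))        # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

With perfect agreement the statistic is 1; a constant per-rater offset (absolute disagreement despite identical rank order) pulls it below 1, which is why ICC(2,1) is stricter than a plain correlation.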

📝 Abstract
Human emotions can be conveyed through nuanced touch gestures. However, there is a lack of understanding of how consistently emotions can be conveyed to robots through touch. This study explores the consistency of touch-based emotional expression toward a robot by integrating tactile and auditory sensing of affective haptic expressions. We developed a piezoresistive pressure sensor and used a microphone to capture the touch and sound channels, respectively. In a study with 28 participants, each conveyed 10 emotions to a robot using spontaneous touch gestures. Our findings reveal a statistically significant consistency in emotion expression among participants. However, some emotions obtained low intraclass correlation values. Additionally, certain emotions with similar levels of arousal or valence did not exhibit significant differences in the way they were conveyed. We subsequently constructed a multimodal model integrating touch and audio features to decode the 10 emotions. A support vector machine (SVM) model demonstrated the highest accuracy, achieving 40% for 10 classes, with "Attention" being the most accurately conveyed emotion at a balanced accuracy of 87.65%.
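As a concrete illustration of the touch–audio fusion the abstract describes, here is a minimal numpy sketch of early feature fusion: simple hand-crafted descriptors from a pressure time series and an audio frame are concatenated into one vector. The descriptors and function names are assumptions for illustration, not the paper's actual feature set; fused vectors like these could then be fed to an off-the-shelf classifier such as scikit-learn's `SVC`, mirroring the SVM setup reported above.

```python
import numpy as np

def touch_features(pressure):
    """Summary statistics of one pressure-sensor time series."""
    return np.array([pressure.mean(), pressure.max(),
                     pressure.std(), np.abs(np.diff(pressure)).mean()])

def audio_features(wave, sr=16000):
    """Crude audio descriptors: RMS energy, zero-crossing rate, spectral centroid."""
    rms = np.sqrt(np.mean(wave ** 2))
    zcr = np.mean(np.abs(np.diff(np.sign(wave)))) / 2
    spec = np.abs(np.fft.rfft(wave))
    freqs = np.fft.rfftfreq(len(wave), d=1 / sr)
    centroid = (freqs * spec).sum() / (spec.sum() + 1e-12)
    return np.array([rms, zcr, centroid])

def fused_vector(pressure, wave):
    """Early fusion: concatenate touch and audio features into one vector."""
    return np.concatenate([touch_features(pressure), audio_features(wave)])
```

Early (feature-level) fusion keeps the classifier simple; the trade-off is that the two channels must be aligned per gesture before features are extracted.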
Problem

Research questions and friction points this paper is trying to address.

How reliably can robots interpret human emotions conveyed through touch and sound
Whether emotional touch gestures toward robots are consistent and mutually distinguishable
Whether multimodal models improve emotion and gesture decoding accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal integration of touch and sound
Custom piezoresistive pressure sensor usage
CNN-LSTM for high gesture classification accuracy