Speech Command + Speech Emotion: Exploring Emotional Speech Commands as a Compound and Playful Modality

📅 2025-04-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the missing affective dimension in voice-based interaction, which undermines the sociality of intelligent agents. To bridge this gap, we propose "emotional speech commands," a composite modality that jointly models the semantic content and prosodic affect (e.g., intonation, rhythm) of spoken instructions. We implement a real-time affect-aware agent prototype and evaluate it in a retro-style dual-vehicle control game (N = 14). Using voice emotion analysis, real-time behavioral adaptation, and Likert-scale assessments, we empirically identify users' conscious "voice acting" during interaction, i.e., deliberate modulation of vocal expression to convey intent or emotion. Results show that the affect-adaptive agent significantly enhances perceived stimulation (p < 0.01) and dependability (p < 0.05). These findings substantiate the role of affective responsiveness in improving interactional credibility and social presence.

📝 Abstract
In an era of human-computer interaction with increasingly agentic AI systems capable of connecting with users conversationally, speech is an important modality for commanding agents. By recognizing and using speech emotions (i.e., how a command is spoken), we can provide agents with the ability to emotionally accentuate their responses and socially enrich users' perceptions and experiences. To explore the concept and impact of speech emotion commands on user perceptions, we realized a prototype and conducted a user study (N = 14) where speech commands are used to steer two vehicles in a minimalist and retro game style implementation. While both agents execute user commands, only one of the agents uses speech emotion information to adapt its execution behavior. We report on differences in how users perceived each agent, including significant differences in stimulation and dependability, outline implications for designing interactions with agents using emotional speech commands, and provide insights on how users consciously emote, which we describe as "voice acting".
Problem

Research questions and friction points this paper is trying to address.

Exploring emotional speech commands for human-computer interaction
Investigating the impact of speech emotion on user perceptions
Designing agents that adapt behavior using emotional speech
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combining speech commands with emotional tones
Prototype using emotion-adaptive agent behavior
User study on perception of emotional agents
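The core idea, a recognized command plus the emotion in how it was spoken jointly determining the agent's execution, can be sketched as a toy mapping. This is a minimal illustration, not the authors' implementation: the command set, the arousal score, and the speed rule are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class EmotionalCommand:
    command: str   # recognized semantic content of the utterance, e.g. "go"
    arousal: float # hypothetical vocal arousal estimate in [0, 1]

def adapt_execution(cmd: EmotionalCommand, base_speed: float = 1.0) -> dict:
    """Map a command plus its vocal arousal to execution parameters.

    A calm "go" yields a gentle move; an excited "go" a faster one.
    The 0.5 + arousal scaling is an illustrative choice only.
    """
    if cmd.command not in {"go", "stop", "left", "right"}:
        raise ValueError(f"unknown command: {cmd.command}")
    speed = 0.0 if cmd.command == "stop" else base_speed * (0.5 + cmd.arousal)
    return {"action": cmd.command, "speed": round(speed, 2)}

print(adapt_execution(EmotionalCommand("go", arousal=0.2)))  # calm, slower
print(adapt_execution(EmotionalCommand("go", arousal=0.9)))  # excited, faster
```

In the study's setup, only one of the two agents applies this kind of emotion-conditioned adaptation, which is what lets the authors compare user perceptions of the two.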
Ilhan Aslan
Associate Professor, Aalborg University
Intelligent User Interfaces · Human-Computer Interaction · Human-Centered AI
Timothy Merritt
Aalborg University, Denmark
Stine S. Johansen
Aalborg University, Denmark
N. V. Berkel
Aalborg University, Denmark