🤖 AI Summary
Speech emotion recognition (SER) is hindered by the subtlety of emotional expression and by semantic ambiguity in speech. This work begins by systematically disentangling descriptive semantics, which convey event content and speaker intent, from expressive semantics, which encode affective responses and physiological arousal. On this basis, we propose a dual-semantic decoupling framework. Leveraging post-viewing narrative speech, we design a multimodal experimental paradigm that integrates acoustic feature analysis, fine-grained emotion annotation, self-reported valence–arousal ratings, and intent–emotion discrimination. Results show that descriptive semantics strongly predict intent-related emotions, whereas expressive semantics more accurately reflect elicited emotions; modeling the two jointly improves SER accuracy by 12.3% and makes recognition more adaptable to emotional context. This study advances situation-aware affective computing by offering a theoretically grounded, interpretable modeling approach rooted in semantic decomposition.
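The summary does not show the authors' architecture, so as a rough sketch of what "joint modeling" of the two semantic streams could look like, here is a minimal late-fusion classifier in PyTorch. The class name `DualSemanticSER`, the feature dimensions, and the concatenation-based fusion are illustrative assumptions, not the paper's method.

```python
import torch
import torch.nn as nn

class DualSemanticSER(nn.Module):
    """Hypothetical two-branch fusion: one branch for descriptive
    semantics (content/intent embeddings) and one for expressive
    semantics (acoustic/prosodic features). Not the authors' model."""

    def __init__(self, desc_dim=768, expr_dim=128, hidden=256, n_emotions=6):
        super().__init__()
        # project each semantic stream into a shared hidden space
        self.desc_branch = nn.Sequential(nn.Linear(desc_dim, hidden), nn.ReLU())
        self.expr_branch = nn.Sequential(nn.Linear(expr_dim, hidden), nn.ReLU())
        # classify from the concatenated (late-fused) representation
        self.classifier = nn.Linear(2 * hidden, n_emotions)

    def forward(self, desc_feat, expr_feat):
        fused = torch.cat([self.desc_branch(desc_feat),
                           self.expr_branch(expr_feat)], dim=-1)
        return self.classifier(fused)

# toy usage: random tensors stand in for real embeddings
model = DualSemanticSER()
desc = torch.randn(4, 768)   # e.g., sentence-level text embeddings
expr = torch.randn(4, 128)   # e.g., utterance-level acoustic features
logits = model(desc, expr)   # shape (4, 6): one logit per emotion class
```

Late fusion is only one plausible reading of "joint modeling"; attention-based or gated fusion would fit the same description.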
📝 Abstract
Speech Emotion Recognition (SER) is essential for improving human-computer interaction, yet its accuracy remains constrained by the complexity of emotional nuance in speech. In this study, we distinguish between descriptive semantics, which represent the contextual content of speech, and expressive semantics, which reflect the speaker's emotional state. After participants watched emotionally charged movie segments, we recorded audio clips of them describing their experiences, along with the intended emotion tag for each clip, participants' self-rated emotional responses, and their valence/arousal scores. Our experiments show that descriptive semantics align with intended emotions, while expressive semantics correlate with evoked emotions. These findings inform SER applications in human-AI interaction and pave the way for more context-aware AI systems.
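To make the data-collection protocol concrete, here is a hypothetical per-clip record bundling the annotations the abstract lists (intended emotion tag, self-reported emotion, valence/arousal scores). Field names and the rating scale are assumptions for illustration, not the paper's schema.

```python
from dataclasses import dataclass

@dataclass
class NarrativeClip:
    """Hypothetical record for one post-viewing narration clip;
    fields mirror the annotations the abstract describes."""
    audio_path: str             # recorded narration after the movie segment
    intended_emotion: str       # emotion the segment was designed to elicit
    self_reported_emotion: str  # participant's own emotion label
    valence: float              # self-rated valence (assumed 1-9 scale)
    arousal: float              # self-rated arousal (assumed 1-9 scale)

clip = NarrativeClip("p01_clip03.wav", "sadness", "sadness",
                     valence=2.5, arousal=6.0)
```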