Color-based Emotion Representation for Speech Emotion Recognition

📅 2026-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the limitations of traditional speech emotion recognition approaches that rely on discrete categorical or dimensional labels, which often fail to capture both the richness and interpretability of emotional expression. To overcome this, the work introduces color attributes—hue, saturation, and value (brightness)—as a continuous and intuitive representation of emotion. A crowdsourced dataset annotated with these color-based labels is constructed, and a multi-task learning framework is proposed to jointly optimize color attribute regression and emotion classification. Experimental results demonstrate a strong correlation between color attributes and vocal emotion, validating the effectiveness of the regression model. Furthermore, the multi-task learning strategy significantly enhances performance across individual subtasks, improving both model interpretability and generalization capability.

📝 Abstract
Speech emotion recognition (SER) has traditionally relied on categorical or dimensional labels. However, this technique is limited in representing both the diversity and interpretability of emotions. To overcome this limitation, we focus on color attributes, such as hue, saturation, and value, to represent emotions as continuous and interpretable scores. We annotated an emotional speech corpus with color attributes via crowdsourcing and analyzed them. Moreover, we built regression models for color attributes in SER using machine learning and deep learning, and explored the multitask learning of color attribute regression and emotion classification. As a result, we demonstrated the relationship between color attributes and emotions in speech, and successfully developed color attribute regression models for SER. We also showed that multitask learning improved the performance of each task.
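The multitask setup the abstract describes — jointly optimizing color attribute regression and emotion classification over a shared speech representation — can be sketched as a weighted joint loss. This is a minimal NumPy illustration, not the paper's actual model: the feature dimension, number of emotion classes, task weight `lam`, and the linear heads are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multitask_loss(features, W_reg, W_cls, hsv_targets, emo_labels, lam=0.5):
    """Joint objective: MSE on color attributes + lam * cross-entropy on emotion class."""
    hsv_pred = features @ W_reg                  # (batch, 3): hue, saturation, value
    reg_loss = np.mean((hsv_pred - hsv_targets) ** 2)
    probs = softmax(features @ W_cls)            # (batch, n_emotions)
    cls_loss = -np.mean(np.log(probs[np.arange(len(emo_labels)), emo_labels] + 1e-12))
    return reg_loss + lam * cls_loss

# Toy shared features for a batch of 4 utterances (dim 8, assumed).
feats = rng.normal(size=(4, 8))
W_reg = rng.normal(scale=0.1, size=(8, 3))       # regression head: HSV scores
W_cls = rng.normal(scale=0.1, size=(8, 4))       # classification head: 4 emotions (assumed)
hsv = rng.uniform(size=(4, 3))                   # crowdsourced color-attribute targets in [0, 1]
labels = np.array([0, 2, 1, 3])                  # emotion class indices

loss = multitask_loss(feats, W_reg, W_cls, hsv, labels)
print(round(loss, 4))
```

In practice both heads would share a learned speech encoder (the paper evaluates machine learning and deep learning regressors), and `lam` would be tuned so neither task dominates the shared representation.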
Problem

Research questions and friction points this paper is trying to address.

Speech Emotion Recognition
Emotion Representation
Color Attributes
Emotion Diversity
Emotion Interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Color-based Emotion Representation
Speech Emotion Recognition
Multitask Learning
Emotion Regression
Crowdsourced Annotation
Ryotaro Nagase
College of Information Science and Engineering, Ritsumeikan University, Osaka, Japan
Ryoichi Takashima
Ritsumeikan University
Machine learning · Statistical signal processing · Speech processing
Yoichi Yamashita
College of Information Science and Engineering, Ritsumeikan University, Osaka, Japan