EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations

📅 2025-05-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing Chinese sentiment datasets suffer from poor linguistic and cultural adaptability, unimodal representation, and coarse-grained annotations, hindering high-quality multimodal sentiment analysis. To address these limitations, the authors introduce EmotionTalk, the first high-quality, dialogue-oriented, multimodal emotion dataset tailored to the Chinese context. EmotionTalk comprises natural dyadic conversations involving 19 actors, with 23.6 hours of synchronized audiovisual recordings and corresponding transcripts. Annotations are fine-grained, covering seven emotion categories, five-level sentiment polarity, and four-dimensional speech captions. The paper proposes a standardized pipeline for multimodal acquisition, cross-modal alignment, and data cleaning, ensuring speaker- and modality-consistent labeling. The dataset is publicly released with 19,250 samples. Empirical evaluation demonstrates strong performance on both unimodal and multimodal emotion recognition tasks, establishing a benchmark for cross-cultural affective modeling, missing-modality research, and speech captioning.

📝 Abstract
In recent years, emotion recognition has come to play a critical role in applications such as human-computer interaction, mental health monitoring, and sentiment analysis. While datasets for emotion analysis in languages such as English have proliferated, there remains a pressing need for high-quality, comprehensive datasets tailored to the unique linguistic, cultural, and multimodal characteristics of Chinese. In this work, we propose EmotionTalk, an interactive Chinese multimodal emotion dataset with rich annotations. This dataset provides multimodal information from 19 actors participating in dyadic conversational settings, incorporating acoustic, visual, and textual modalities. It includes 23.6 hours of speech (19,250 utterances), annotations for 7 utterance-level emotion categories (happy, surprise, sad, disgust, anger, fear, and neutral), 5-dimensional sentiment labels (negative, weakly negative, neutral, weakly positive, and positive), and 4-dimensional speech captions (speaker, speaking style, emotion, and overall). The dataset is well-suited for research on unimodal and multimodal emotion recognition, missing-modality challenges, and speech captioning tasks. To our knowledge, it represents the first high-quality and versatile Chinese dialogue multimodal emotion dataset, which is a valuable contribution to research on cross-cultural emotion analysis and recognition. Additionally, we conduct experiments on EmotionTalk to demonstrate the effectiveness and quality of the dataset. It will be open-source and freely available for all academic purposes. The dataset and codes will be made available at: https://github.com/NKU-HLT/EmotionTalk.
Problem

Research questions and friction points this paper is trying to address.

Lack of high-quality Chinese multimodal emotion datasets
Need for culturally tailored emotion recognition resources
Addressing missing modality challenges in emotion analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Chinese emotion dataset with rich annotations
Includes acoustic, visual, and textual modalities
Supports unimodal and multimodal emotion recognition
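The abstract fully specifies the dataset's label space (7 emotion categories, 5 sentiment polarity levels, and 4 caption dimensions), which can be captured in a small record sketch. Note that the record layout, field names, and file paths below are hypothetical assumptions for illustration; the actual release format on the project's GitHub may differ. Only the label sets themselves come from the paper.

```python
from dataclasses import dataclass

# Label sets taken verbatim from the abstract.
EMOTIONS = {"happy", "surprise", "sad", "disgust", "anger", "fear", "neutral"}
SENTIMENTS = {"negative", "weakly negative", "neutral",
              "weakly positive", "positive"}
CAPTION_DIMS = ("speaker", "speaking style", "emotion", "overall")

@dataclass
class Utterance:
    """Hypothetical per-utterance record; layout is an assumption."""
    audio_path: str   # acoustic modality
    video_path: str   # visual modality
    transcript: str   # textual modality (Chinese)
    emotion: str      # one of the 7 utterance-level categories
    sentiment: str    # one of the 5 polarity labels
    captions: dict    # 4-dimensional speech caption

    def validate(self) -> bool:
        # Check that labels fall inside the schema described in the abstract.
        return (self.emotion in EMOTIONS
                and self.sentiment in SENTIMENTS
                and all(dim in self.captions for dim in CAPTION_DIMS))

sample = Utterance(
    audio_path="wav/dialog_001_utt_03.wav",   # hypothetical path
    video_path="mp4/dialog_001_utt_03.mp4",   # hypothetical path
    transcript="...",
    emotion="happy",
    sentiment="weakly positive",
    captions={dim: "..." for dim in CAPTION_DIMS},
)
print(sample.validate())  # True
```

Validation of this kind is useful when pairing the three modalities for missing-modality experiments, since a record with an out-of-schema label can be filtered before training.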
👥 Authors
Haoqin Sun — Nankai University (Affective computing, Speech signal processing, Audio understanding)
Xuechen Wang — Nankai University
Jinghua Zhao — Nankai University
Shiwan Zhao — Independent Researcher; formerly Research Scientist at IBM Research - China (2000–2020) (AGI, Large Language Models, NLP, Speech, Recommender Systems)
Jiaming Zhou — Nankai University
Hui Wang — Nankai University
Jiabei He — Nankai University
Aobo Kong — Nankai University (NLP, LLM)
Xi Yang — Beijing Academy of Artificial Intelligence
Yequan Wang — Beijing Academy of Artificial Intelligence
Yonghua Lin — Beijing Academy of Artificial Intelligence
Yong Qin — Nankai University