C-MTCSD: A Chinese Multi-Turn Conversational Stance Detection Dataset

📅 2025-04-14
🤖 AI Summary
Stance detection in Chinese multi-turn dialogue is hampered by hard-to-identify implicit stances and a scarcity of annotated data. To address this, we introduce C-MTCSD, the largest benchmark dataset for this task, comprising 24,264 finely annotated multi-turn conversation instances from Sina Weibo, 4.2 times larger than the only prior Chinese conversational stance detection dataset. We propose an annotation paradigm that integrates expert-designed rule verification with crowdsourced collaboration to ensure high-quality labels. We systematically evaluate SVM, BERT, ChatGLM, and Qwen under both supervised and zero-shot settings. Results show that dialogue depth and stance implicitness degrade model performance: F1 scores consistently decline in deeper dialogues, and traditional models score below 50% F1 on implicit stance detection. Under zero-shot evaluation, the best-performing model attains only 64.07% F1, establishing a new baseline. This work provides a data resource and methodological insights for Chinese stance analysis.

📝 Abstract
Stance detection has become an essential tool for analyzing public discussions on social media. Current methods face significant challenges, particularly in Chinese language processing and multi-turn conversational analysis. To address these limitations, we introduce C-MTCSD, the largest Chinese multi-turn conversational stance detection dataset, comprising 24,264 carefully annotated instances from Sina Weibo, which is 4.2 times larger than the only prior Chinese conversational stance detection dataset. Our comprehensive evaluation using both traditional approaches and large language models reveals the complexity of C-MTCSD: even state-of-the-art models achieve only 64.07% F1 score in the challenging zero-shot setting, while performance consistently degrades with increasing conversation depth. Traditional models particularly struggle with implicit stance detection, achieving below 50% F1 score. This work establishes a challenging new benchmark for Chinese stance detection research, highlighting significant opportunities for future improvements.
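The abstract reports F1 scores and notes that performance degrades with conversation depth. A minimal sketch of how such a per-depth breakdown could be computed is shown here. It assumes macro-averaged F1 and a three-way label set (`favor`, `against`, `none`); neither is confirmed by the text above, and the function names and toy records are hypothetical.

```python
from collections import defaultdict

# Assumed label set, for illustration only; the dataset's actual labels may differ.
LABELS = ("favor", "against", "none")


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-label F1; a label absent from both lists scores 0."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def f1_by_depth(records):
    """Group (depth, gold, pred) triples by depth and score each group separately."""
    groups = defaultdict(lambda: ([], []))
    for depth, gold, pred in records:
        groups[depth][0].append(gold)
        groups[depth][1].append(pred)
    return {d: macro_f1(g, p) for d, (g, p) in sorted(groups.items())}


# Toy records invented for this example: (conversation depth, gold label, predicted label)
toy = [
    (1, "favor", "favor"),
    (1, "against", "against"),
    (1, "none", "none"),
    (3, "favor", "none"),
    (3, "against", "against"),
    (3, "none", "none"),
]

for depth, score in f1_by_depth(toy).items():
    print(f"depth {depth}: macro-F1 = {score:.4f}")
```

Scoring each depth separately, rather than pooling all instances, is what makes a depth-related decline visible; a single pooled score would hide it.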
Problem

Research questions and friction points this paper is trying to address.

Detecting stance in Chinese multi-turn conversations
Addressing limitations in Chinese language processing
Improving performance in implicit stance detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Largest Chinese multi-turn stance dataset
Evaluates traditional and LLM approaches
Highlights implicit stance detection challenges
Authors

Fuqiang Niu
Shenzhen Technology University, Shenzhen, China
Yi Yang
Shenzhen Technology University, Shenzhen, China
Xianghua Fu
Shenzhen Technology University
Machine Learning, Natural Language Processing
Genan Dai
Shenzhen Technology University
Spatio-temporal Data Mining
Bowen Zhang
Shenzhen Technology University, Shenzhen, China