C-MTCSD: A Chinese Multi-Turn Conversational Stance Detection Dataset

📅 2025-04-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Chinese multi-turn conversational stance detection is hampered by hard-to-identify implicit stances and a scarcity of annotated data. To address these, we introduce C-MTCSD, the largest benchmark dataset for this task, comprising 24,264 annotated instances from Sina Weibo multi-turn conversations, 4.2 times larger than the only prior Chinese conversational stance detection dataset. We propose an annotation paradigm that combines expert-designed rule verification with crowdsourced collaboration to ensure high-quality labels. We systematically evaluate SVM, BERT, ChatGLM, and Qwen under both supervised and zero-shot settings. Results show that conversation depth and stance implicitness degrade model performance: F1 scores consistently decline as conversations get deeper, and traditional models achieve below 50% F1 on implicit stance detection. Under zero-shot evaluation, the best-performing model attains only 64.07% F1. This work provides a challenging benchmark and data resource for Chinese stance analysis.

📝 Abstract

Stance detection has become an essential tool for analyzing public discussions on social media. Current methods face significant challenges, particularly in Chinese language processing and multi-turn conversational analysis. To address these limitations, we introduce C-MTCSD, the largest Chinese multi-turn conversational stance detection dataset, comprising 24,264 carefully annotated instances from Sina Weibo, which is 4.2 times larger than the only prior Chinese conversational stance detection dataset. Our comprehensive evaluation using both traditional approaches and large language models reveals the complexity of C-MTCSD: even state-of-the-art models achieve only 64.07% F1 score in the challenging zero-shot setting, while performance consistently degrades with increasing conversation depth. Traditional models particularly struggle with implicit stance detection, achieving below 50% F1 score. This work establishes a challenging new benchmark for Chinese stance detection research, highlighting significant opportunities for future improvements.
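The abstract reports F1 scores and notes that performance degrades with conversation depth. A minimal sketch of how such a per-depth breakdown could be computed is shown here. The label set (`favor`, `against`, `none`), the macro averaging over all three labels, and the `(depth, gold, predicted)` input layout are assumptions made for illustration; this page does not state which F1 variant or data format the paper uses.

```python
from collections import defaultdict

# Assumed label set; the page does not list the dataset's actual labels.
LABELS = ("favor", "against", "none")


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-label F1 scores (an assumed averaging scheme)."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A label with no gold and no predicted instances scores 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def f1_by_depth(examples):
    """Group (depth, gold, predicted) triples by depth and score each group."""
    buckets = defaultdict(lambda: ([], []))
    for depth, gold, pred in examples:
        buckets[depth][0].append(gold)
        buckets[depth][1].append(pred)
    return {d: macro_f1(g, p) for d, (g, p) in sorted(buckets.items())}


# Toy, made-up predictions: perfect at depth 1, one error at depth 3.
toy = [
    (1, "favor", "favor"), (1, "against", "against"), (1, "none", "none"),
    (3, "favor", "none"), (3, "against", "against"), (3, "none", "none"),
]
scores = f1_by_depth(toy)
print(scores)
```

Bucketing by depth before scoring, rather than reporting one global F1, is what makes a depth-related decline visible. Some stance detection work averages F1 over the favor and against classes only; passing a different `labels` tuple to `macro_f1` covers that variant.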

Problem

Research questions and friction points this paper is trying to address.

Detecting stance in Chinese multi-turn conversations

Addressing limitations in Chinese language processing

Improving performance in implicit stance detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Largest Chinese multi-turn stance dataset

Evaluates traditional and LLM approaches

Highlights implicit stance detection challenges

🔎 Similar Papers

USDC: A Dataset of User Stance and Dogmatism in Long Conversations