TwiUSD: A Benchmark Dataset and Structure-Aware LLM Framework for User Stance Detection

📅 2025-06-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
User stance detection (UserSD) has lacked high-quality benchmark datasets that jointly capture linguistic and social-structural signals. To address this, we introduce TwiUSD, the first large-scale, manually annotated UserSD benchmark, comprising 16,211 users and 47,757 tweets with explicit followee relationships, so that the follow graph can be modeled directly alongside tweet content. To handle noise and context heterogeneity, we propose MRFG, a structure-aware framework that combines LLM-based relevance filtering with multi-scale filtering and feature routing: features are sent through graph neural networks (GNNs) or multi-layer perceptrons (MLPs) depending on how informative the local topology is. Experiments show that MRFG consistently outperforms strong baselines, including PLMs, graph-based models, and LLM prompting methods, in both in-target and cross-target evaluation. These results support the value of explicitly modeling social graph structure for UserSD.

📝 Abstract
User-level stance detection (UserSD) remains challenging due to the lack of high-quality benchmarks that jointly capture linguistic and social structure. In this paper, we introduce TwiUSD, the first large-scale, manually annotated UserSD benchmark with explicit followee relationships, containing 16,211 users and 47,757 tweets. TwiUSD enables rigorous evaluation of stance models by integrating tweet content and social links, with superior scale and annotation quality. Building on this resource, we propose MRFG: a structure-aware framework that uses LLM-based relevance filtering and feature routing to address noise and context heterogeneity. MRFG employs multi-scale filtering and adaptively routes features through graph neural networks or multi-layer perceptrons based on topological informativeness. Experiments show MRFG consistently outperforms strong baselines (including PLMs, graph-based models, and LLM prompting) in both in-target and cross-target evaluation.

Problem

Research questions and friction points this paper is trying to address.

Lack of high-quality benchmarks for user stance detection
Noise and context heterogeneity in user-level tweet and social-link data
Need for structure-aware frameworks integrating social links

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale annotated dataset with social links
LLM-based relevance filtering for noise reduction
Adaptive feature routing between graph neural networks and multi-layer perceptrons, based on topological informativeness
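
The two ideas listed above (relevance filtering, then routing by topology) can be illustrated with a minimal sketch. This is not the authors' MRFG implementation: the names `filter_tweets` and `route_user`, the precomputed relevance scores standing in for an LLM judgment, and the use of followee count as a proxy for "topological informativeness" are all assumptions made for illustration only.

```python
from statistics import fmean


def mean_vec(vectors):
    """Element-wise mean of equal-length feature vectors."""
    return [fmean(column) for column in zip(*vectors)]


def filter_tweets(tweets, relevance, threshold=0.5):
    """Keep tweets whose relevance score meets the threshold.

    `relevance` maps tweet id -> score in [0, 1]. In this sketch the scores
    are given; they are a hypothetical stand-in for an LLM relevance call.
    """
    return [t for t in tweets if relevance[t] >= threshold]


def route_user(user, features, followees, min_degree=2):
    """Choose a branch for one user and return (branch, representation).

    Assumption: a user with at least `min_degree` known followees is treated
    as topologically informative and sent to the graph branch; everyone else
    falls back to the text-only (MLP-style) branch.
    """
    neighbors = [v for v in followees.get(user, []) if v in features]
    if len(neighbors) >= min_degree:
        # Graph branch: one round of mean aggregation over self and followees.
        pooled = mean_vec([features[user]] + [features[v] for v in neighbors])
        return "graph", pooled
    # Text-only branch: pass the user's own features through unchanged.
    return "mlp", list(features[user])


# Toy data (hypothetical): three users with 2-d features and followee lists.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
followees = {"a": ["b", "c"], "b": ["a"]}

kept = filter_tweets(["t1", "t2"], {"t1": 0.9, "t2": 0.1})
branch_a, rep_a = route_user("a", features, followees)
branch_b, rep_b = route_user("b", features, followees)
```

In a real system the two branches would be trained networks and the routing criterion would likely be learned or richer than a degree threshold; the sketch only shows where the routing decision sits in the pipeline.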
Fuaing Niu
The College of Big Data and Internet, Shenzhen Technology University, China
Zini Chen
The College of Big Data and Internet, Shenzhen Technology University, China
Zhiyu Xie
The College of Big Data and Internet, Shenzhen Technology University, China
Genan Dai
Shenzhen Technology University
Spatio-temporal Data Mining
Bowen Zhang
The College of Big Data and Internet, Shenzhen Technology University, China