LASTIST: LArge-Scale Target-Independent STance dataset

📅 2025-10-28
🤖 AI Summary
Existing stance detection research focuses predominantly on target-dependent tasks and relies heavily on English benchmark datasets, which hinders model development for low-resource languages such as Korean. To address this gap, we introduce LASTIST, the first large-scale, target-independent Korean stance detection dataset, comprising 563,299 annotated sentences extracted from official press releases of South Korean political parties. LASTIST supports both target-agnostic stance identification and diachronic stance evolution analysis. Annotation quality is supported by a pipeline that integrates deep-learning-assisted labeling, NLP-driven data cleaning, and cross-source alignment. The dataset is publicly released and evaluated across multiple benchmark tasks. Experimental results indicate that it improves modeling, training, and evaluation capabilities for stance detection in low-resource languages, helping to fill the data gap in non-English stance detection research.

📝 Abstract
Stance detection has emerged as an area of research in the field of artificial intelligence. However, most research currently centers on target-dependent stance detection, which determines whether a person is in favor of or against a specific target. Furthermore, most benchmark datasets are in English, which makes it difficult to develop models for low-resource languages such as Korean, especially in an emerging field such as stance detection. This study proposes the LArge-Scale Target-Independent STance (LASTIST) dataset to fill this research gap. Collected from the press releases of two Korean political parties, the LASTIST dataset contains 563,299 labeled Korean sentences. We provide a detailed description of how we collected and constructed the dataset and trained state-of-the-art deep learning and stance detection models. The LASTIST dataset is designed for various stance detection tasks, including target-independent stance detection and diachronic evolution stance detection. The dataset is available at https://anonymous.4open.science/r/LASTIST-3721/.

Problem

Research questions and friction points this paper is trying to address.

Addressing the lack of target-independent stance detection datasets in AI research
Providing Korean language resources for low-resource stance detection development
Enabling diachronic evolution analysis and target-independent stance classification tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale Korean political stance dataset
Target-independent stance detection framework
Deep learning and stance detection models for Korean stance analysis