🤖 AI Summary
Existing offensive language detection research suffers from reliance on outdated datasets and insufficient evaluation of generalization capability. To address this, this work focuses on contemporary Korean political discourse and introduces the first large-scale, newly annotated dataset for offensive language detection in this domain. We propose a three-paradigm pseudo-labeling framework that operates without ground-truth labels, integrating leave-one-out label consensus analysis with strategically designed single-shot prompting to enable lightweight, efficient modeling. Our approach uncovers systematic differences and label-agreement patterns across the distinct judgment paradigms. Empirical results demonstrate performance on par with resource-intensive supervised baselines, along with improved robustness and interpretability. This work establishes a reproducible, scalable paradigm for offensive language detection in low-resource, rapidly evolving linguistic environments.
📝 Abstract
Although offensive language continually evolves over time, even recent LLM-based studies have predominantly relied on outdated datasets and rarely evaluated generalization to unseen texts. In this study, we constructed a large-scale dataset of contemporary political discourse and employed three refined judgments in the absence of ground truth. Each judgment reflects a representative offensive language detection method and is carefully designed for optimal conditions. We identified distinct patterns for each judgment and demonstrated tendencies of label agreement using a leave-one-out strategy. By establishing pseudo-labels as ground truth for quantitative performance assessment, we observed that a strategically designed single prompt achieves performance comparable to more resource-intensive methods. This suggests a feasible approach for real-world settings with inherent resource constraints.
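The abstract's leave-one-out agreement analysis and majority-vote pseudo-labeling can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the `"offensive"`/`"clean"` label set, and the unanimity rule for the held-out comparison are all assumptions for demonstration.

```python
from collections import Counter

def majority_pseudo_label(votes):
    """Assign a pseudo-label by strict majority vote over judgments;
    return None when no label wins a majority (e.g. a tie)."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None

def leave_one_out_agreement(votes):
    """For each judgment, hold it out and check whether it matches
    the majority label of the remaining judgments (if one exists)."""
    agreement = {}
    for i, held_out in enumerate(votes):
        rest = votes[:i] + votes[i + 1:]
        consensus = majority_pseudo_label(rest)
        agreement[i] = consensus is not None and held_out == consensus
    return agreement

# Three hypothetical judgments on one comment: two flag it, one does not.
votes = ["offensive", "offensive", "clean"]
print(majority_pseudo_label(votes))    # → offensive
print(leave_one_out_agreement(votes))
```

With three judgments, holding one out leaves only two, so the held-out judgment counts as agreeing only when the other two are unanimous; this makes the leave-one-out check stricter than the plain majority vote used for the final pseudo-label.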