🤖 AI Summary
This study addresses a core limitation of traditional stance detection: compressing multidimensional attitudes into a single categorical label (support/oppose/neutral) sharply reduces inter-annotator agreement when attitude dimensions conflict, and so fails to capture nuanced stances. The work introduces and empirically validates the "projection problem," showing that the forced single-label approach breaks down specifically under dimensional conflict and thereby exposing the conditional limits of the prevailing ternary classification paradigm. To address this, the authors propose a dimension-decoupling analysis framework that compares annotation reliability, measured by Krippendorff's α, between holistic labels and fine-grained attitude dimensions (e.g., scientific facts, policy). A pilot study on the SemEval-2016 Task 6 dataset shows that when dimensions align, holistic labels achieve moderate reliability (α = 0.307); under conflict, however, holistic-label reliability collapses to α = 0.085 while fine-grained dimensional annotations remain substantially more reliable (up to α = 0.572 for the Policy dimension).
📝 Abstract
Stance detection is nearly always formulated as classifying text into Favor, Against, or Neutral -- a convention inherited from debate analysis and applied without modification to social media since SemEval-2016. But attitudes toward complex targets are not unitary: a person can accept climate science while opposing carbon taxes, expressing support on one dimension and opposition on another. When annotators must compress such multi-dimensional attitudes into a single label, different annotators weight different dimensions -- producing disagreement that reflects not confusion but different compression choices. We call this the \textbf{projection problem}, and show that its cost is conditional: when a text's dimensions align, any weighting yields the same label and three-way annotation works well; when dimensions conflict, label agreement collapses while agreement on individual dimensions remains intact. A pilot study on SemEval-2016 Task 6 confirms this crossover: on dimension-consistent texts, label agreement (Krippendorff's $\alpha = 0.307$) exceeds dimensional agreement ($\alpha = 0.082$); on dimension-conflicting texts, the pattern reverses -- label $\alpha$ drops to $0.085$ while dimensional $\alpha$ rises to $0.334$, with Policy reaching $0.572$. The projection problem is real -- but it activates precisely where it matters most.
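The reliability comparison above rests on Krippendorff's α for nominal annotations. Below is a minimal sketch of that computation and of the holistic-vs-dimensional comparison it supports; the coincidence-matrix formulation is the standard nominal-scale definition, while the function name `krippendorff_alpha_nominal` and the toy annotation matrices are illustrative assumptions, not the paper's code or data.

```python
from collections import Counter, defaultdict

def krippendorff_alpha_nominal(reliability_data):
    """Nominal-scale Krippendorff's alpha.

    reliability_data: list of per-annotator label lists (annotators x items),
    with None marking a missing annotation.
    """
    # Keep only items that at least two annotators labeled (pairable values).
    units = []
    for item in zip(*reliability_data):
        vals = [v for v in item if v is not None]
        if len(vals) >= 2:
            units.append(vals)

    # Coincidence matrix: every ordered pair of values within an item
    # contributes 1 / (m - 1), where m is the number of labels for that item.
    o = defaultdict(float)
    for vals in units:
        m = len(vals)
        for i, vi in enumerate(vals):
            for j, vj in enumerate(vals):
                if i != j:
                    o[(vi, vj)] += 1.0 / (m - 1)

    # Marginals over the coincidence matrix.
    n_c = Counter()
    for (c, _), w in o.items():
        n_c[c] += w
    n = sum(n_c.values())

    # Nominal distance: 0 for identical labels, 1 otherwise.
    d_observed = sum(w for (c, k), w in o.items() if c != k)
    d_expected = sum(n_c[c] * n_c[k] for c in n_c for k in n_c if c != k) / (n - 1)
    return 1.0 - d_observed / d_expected if d_expected > 0 else 1.0


# Hypothetical dimension-conflicting texts: annotators agree on the science
# dimension but compress the holistic label differently.
holistic = [
    ["FAVOR",   "AGAINST", "NONE"],   # annotator 1
    ["AGAINST", "AGAINST", "FAVOR"],  # annotator 2
    ["FAVOR",   "NONE",    "FAVOR"],  # annotator 3
]
science = [
    ["FAVOR", "AGAINST", "FAVOR"],
    ["FAVOR", "AGAINST", "FAVOR"],
    ["FAVOR", "AGAINST", "FAVOR"],
]
print(krippendorff_alpha_nominal(holistic))  # low: different compression choices
print(krippendorff_alpha_nominal(science))   # 1.0: perfect dimensional agreement
```

Running the same comparison separately on dimension-consistent and dimension-conflicting subsets is what exposes the crossover the abstract reports: holistic α holds up where dimensions align and collapses where they conflict.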