🤖 AI Summary
This study addresses argument mining in online comments on polarized public controversies (e.g., abortion), framing the problem as a three-stage pipeline for mining predefined, topic-specific arguments: argument existence detection, span extraction, and logical relation classification. We conduct a systematic evaluation of four state-of-the-art large language models (LLMs), including fine-tuned variants, on multi-topic, few-shot understanding of emotionally charged arguments. On datasets comprising more than 2,000 comments across six polarizing topics, the models achieve strong overall performance, especially the large and fine-tuned ones. At the same time, we uncover systematic LLM deficiencies on long, nuanced comments and emotionally charged language, and quantify the models' environmental footprint. Key contributions include: (1) a reproducible, end-to-end argument mining pipeline; (2) a cross-topic benchmark for argument understanding under realistic social media conditions; and (3) identification of critical bottlenecks limiting current LLMs’ applicability to public opinion analysis tasks.
📝 Abstract
Automated large-scale analysis of public discussions around contested issues like abortion requires detecting and understanding the use of arguments. While Large Language Models (LLMs) have shown promise in language processing tasks, their performance in mining topic-specific, pre-defined arguments in online comments remains underexplored. We evaluate four state-of-the-art LLMs on three argument mining tasks using datasets comprising over 2,000 opinion comments across six polarizing topics. Quantitative evaluation suggests strong overall performance across the three tasks, especially for large and fine-tuned LLMs, albeit at a significant environmental cost. However, a detailed error analysis reveals systematic shortcomings on long and nuanced comments and on emotionally charged language, raising concerns for downstream applications like content moderation or opinion analysis. Our results highlight both the promise and current limitations of LLMs for automated argument analysis in online comments.