Leveraging Small LLMs for Argument Mining in Education: Argument Component Identification, Classification, and Assessment

📅 2025-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses argument mining in educational settings—specifically, identifying, classifying, and evaluating arguments in student persuasive essays. We propose a lightweight framework based on small open-source decoder-only large language models (e.g., Phi-3, TinyLlama). Methodologically, it integrates few-shot prompting with supervised fine-tuning, leveraging sequence labeling for argument segmentation and text classification for type identification and quality assessment—enabling localized, low-overhead, privacy-preserving real-time feedback. Our key contribution is the first systematic investigation of multi-task adaptability of compact LLMs for educational argument mining, moving beyond traditional encoder-based architectures while balancing deployability and performance. Experiments show that fine-tuned models significantly outperform baselines on the Feedback Prize dataset (argument segmentation F1 +8.2%, type classification accuracy +6.5%); few-shot prompting achieves baseline-level performance in quality assessment, validating the efficacy of the lightweight paradigm.
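The few-shot assessment setup described above can be sketched as a prompt-construction step. The effectiveness labels (Effective/Adequate/Ineffective) come from the Feedback Prize - Predicting Effective Arguments dataset; the exemplar texts and the `build_prompt` helper below are illustrative assumptions, not the paper's actual prompts.

```python
# Hypothetical sketch of few-shot prompting for argument quality assessment.
# The label set follows the Feedback Prize - Predicting Effective Arguments
# dataset; the exemplars and wording here are illustrative assumptions.

LABELS = ["Effective", "Adequate", "Ineffective"]

FEW_SHOT_EXEMPLARS = [
    ("School uniforms should be required because they reduce costs for "
     "families and remove distractions in class.", "Effective"),
    ("I think it is good.", "Ineffective"),
]

def build_prompt(argument: str) -> str:
    """Assemble a few-shot classification prompt for a small chat LLM."""
    lines = [
        "Rate the quality of each student argument as one of: "
        + ", ".join(LABELS) + "."
    ]
    for text, label in FEW_SHOT_EXEMPLARS:
        lines.append(f"Argument: {text}\nQuality: {label}")
    # Leave the final "Quality:" open for the model to complete.
    lines.append(f"Argument: {argument}\nQuality:")
    return "\n\n".join(lines)

prompt = build_prompt("Homework helps students because practice builds skill.")
# The prompt would then be sent to a locally deployed small model, e.g. via
# Hugging Face transformers:
#   pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
```

Keeping the prompt a pure string makes the few-shot format easy to vary independently of whichever small open-source model is deployed locally.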

📝 Abstract
Argument mining algorithms analyze the argumentative structure of essays, making them a valuable tool for enhancing education by providing targeted feedback on students' argumentation skills. While current methods often use encoder or encoder-decoder deep learning architectures, decoder-only models remain largely unexplored, offering a promising research direction. This paper proposes leveraging open-source, small Large Language Models (LLMs) for argument mining through few-shot prompting and fine-tuning. These models' small size and open-source nature ensure accessibility, privacy, and computational efficiency, enabling schools and educators to adopt and deploy them locally. Specifically, we perform three tasks: segmentation of student essays into arguments, classification of the arguments by type, and assessment of their quality. We empirically evaluate the models on the Feedback Prize - Predicting Effective Arguments dataset of grade 6-12 student essays and demonstrate how fine-tuned small LLMs outperform baseline methods in segmenting the essays and determining the argument types, while few-shot prompting yields performance comparable to that of the baselines in assessing quality. This work highlights the educational potential of small, open-source LLMs to provide real-time, personalized feedback, enhancing independent learning and writing skills while ensuring low computational cost and privacy.
Problem

Research questions and friction points this paper is trying to address.

Leverage small LLMs for argument mining in education.
Perform argument segmentation, classification, and quality assessment.
Ensure accessibility, privacy, and computational efficiency.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Few-shot prompting for argument assessment
Fine-tuning small LLMs for segmentation
Open-source LLMs ensure privacy and efficiency
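The sequence-labeling formulation behind the segmentation task can be sketched as BIO decoding: the fine-tuned model emits one tag per token, and contiguous tagged runs become argument spans. The tag scheme and the `bio_to_spans` helper below are illustrative assumptions, not the paper's implementation.

```python
# Illustrative BIO decoding for argument segmentation: a fine-tuned model
# would emit one tag per token; contiguous B-/I- runs become argument spans.
# The tag scheme and helper are assumptions, not the paper's implementation.

def bio_to_spans(tags):
    """Convert token-level BIO tags into (start, end, type) spans, end exclusive."""
    spans, start, arg_type = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:          # close any span still open
                spans.append((start, i, arg_type))
            start, arg_type = i, tag[2:]   # open a new span
        elif tag.startswith("I-") and start is not None and tag[2:] == arg_type:
            continue                       # extend the current span
        else:                              # "O" or inconsistent I- tag
            if start is not None:
                spans.append((start, i, arg_type))
            start, arg_type = None, None
    if start is not None:                  # span running to the end of the essay
        spans.append((start, len(tags), arg_type))
    return spans

tags = ["B-Claim", "I-Claim", "O", "B-Evidence", "I-Evidence", "I-Evidence"]
spans = bio_to_spans(tags)
# → [(0, 2, "Claim"), (3, 6, "Evidence")]
```

The argument-type names here (Claim, Evidence) match categories annotated in the Feedback Prize data; the recovered spans would then feed the type-classification and quality-assessment stages.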