ArguAgent: AI-Supported Real-Time Grouping for Productive Argumentation in STEM Classrooms

📅 2026-04-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses a challenge in STEM classrooms: argumentation is often dominated by high-achieving students, which marginalizes low-achieving peers, and teachers cannot readily assess students’ positions and argument quality in real time to form effective groups. The authors propose ArguAgent, a system grounded in learning progression theory that forms groups dynamically by maximizing stance heterogeneity while constraining differences in argument quality to ±1 level. The system uses a two-component assessment pipeline: calibrated prompts guide large language models (GPT-4o-mini, GPT-5.1, GPT-5.2) to score arguments on a 0–4 rubric, and semantic clustering identifies students’ positions. The scoring component was validated against human expert consensus (Krippendorff’s α = 0.817), reaching QWK = 0.708 after prompt calibration and model upgrades. In simulations across 100 classes, 95.4% of ArguAgent-generated groups meet both design criteria, a 3.2x improvement over random assignment, which suggests the system can support inclusive, evidence-based classroom discourse.
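For readers unfamiliar with the agreement metric, the following is a minimal sketch of quadratic weighted kappa (QWK) for two raters on the 0–4 scale. It is an illustration only, not the authors' evaluation code; the function name `quadratic_weighted_kappa`, its parameters, and the convention of returning 1.0 when both raters give one identical constant score are assumptions made here.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, min_score=0, max_score=4):
    """Cohen's kappa with quadratic weights for two lists of integer scores.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")

    n = len(rater_a)
    num_levels = max_score - min_score + 1
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (a, b) pairs
    hist_a = Counter(rater_a)                  # marginal counts for rater A
    hist_b = Counter(rater_b)                  # marginal counts for rater B

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Penalty grows with the squared distance between the two scores.
            weight = (i - j) ** 2 / (num_levels - 1) ** 2
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * hist_a[i] * hist_b[j] / (n * n)

    if weighted_expected == 0.0:
        # Assumption: both raters used the same single score everywhere.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    human = [0, 1, 2, 3, 4, 2, 3]
    model = [0, 1, 2, 4, 4, 1, 3]
    print(f"QWK = {quadratic_weighted_kappa(human, model):.3f}")
```

Because the weights are quadratic, a model score that is two levels away from the expert score is penalized four times as heavily as a score that is one level away, which suits an ordinal rubric.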

📝 Abstract

Argumentation is a core practice in STEM education, but its productivity depends on who participates and how they interact. Higher-achieving students often dominate the talk and decision-making, while lower-achieving peers may disengage, defer, or comply without contributing substantive reasoning. Forming groups strategically based on students' stances and argumentation skills could help foster inclusive, evidence-based discourse. In practice, however, teachers are constrained in implementing this grouping strategy because it requires real-time insight into students' positions and the quality of their argumentation, information that is difficult to assess reliably and at scale during instruction. We present a generative AI-powered system, ArguAgent, that creates groups optimizing for stance heterogeneity while constraining argumentation quality differences to ±1 level on a validated learning progression. ArguAgent uses a two-component assessment pipeline: first scoring student arguments on a 0-4 rubric, then clustering positions via semantic analysis. We validated the scoring component against human expert consensus (Krippendorff's α = 0.817) using 200 expert-generated scores. Testing three OpenAI models (GPT-4o-mini, GPT-5.1, GPT-5.2) with identical calibrated prompts, we found that systematic prompt engineering informed by human disagreement analysis contributed 89% of scoring improvement (QWK: 0.531 to 0.686), while model upgrades contributed an additional 11% (QWK: 0.686 to 0.708). Simulation testing across 100 classes demonstrated that 95.4% of the groups produced by the grouping algorithm meet both design criteria, a 3.2x improvement over random assignment. These results suggest ArguAgent can enable real-time, theoretically grounded grouping that promotes productive STEM argumentation in classrooms.
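The two design criteria can be made concrete with a small sketch. The abstract does not describe the actual grouping algorithm, so everything below is an assumption for illustration: the `Student` record, the reading of "±1 level" as a maximum within-group spread of one rubric level, the reading of "stance heterogeneity" as at least two distinct stance clusters per group, and the simple greedy strategy in `greedy_groups`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    name: str
    stance: int  # hypothetical stance-cluster id from semantic clustering
    level: int   # argumentation quality on the 0-4 rubric


def meets_criteria(group, max_gap=1):
    """Check both assumed criteria: level spread <= max_gap and >= 2 stances."""
    levels = [s.level for s in group]
    stances = {s.stance for s in group}
    return max(levels) - min(levels) <= max_gap and len(stances) >= 2


def greedy_groups(students, size=3, max_gap=1):
    """Greedy sketch (not the paper's algorithm): seed each group with the
    lowest-level unassigned student, then add students within max_gap levels,
    preferring stances the group does not contain yet."""
    remaining = sorted(students, key=lambda s: (s.level, s.name))
    groups = []
    while remaining:
        seed = remaining.pop(0)
        group = [seed]
        while len(group) < size:
            # The seed has the lowest level, so only an upper bound is needed.
            candidates = [s for s in remaining if s.level - seed.level <= max_gap]
            if not candidates:
                break
            seen = {s.stance for s in group}
            pick = min(candidates, key=lambda s: (s.stance in seen, s.level, s.name))
            remaining.remove(pick)
            group.append(pick)
        groups.append(group)
    return groups


if __name__ == "__main__":
    roster = [
        Student("A", 0, 1), Student("B", 0, 1), Student("C", 1, 2),
        Student("D", 1, 2), Student("E", 0, 3), Student("F", 1, 4),
    ]
    for g in greedy_groups(roster, size=2):
        print([s.name for s in g], "ok" if meets_criteria(g) else "violates criteria")
```

A greedy pass like this can leave undersized or single-stance groups at the end of the roster, which is one reason a real system would likely need a search or optimization step; `meets_criteria` can be used to measure the share of valid groups in a simulation.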

Problem

Research questions and friction points this paper is trying to address.

argumentation

grouping

STEM education

real-time assessment

inclusive discourse

Innovation

Methods, ideas, or system contributions that make the work stand out.

generative AI

real-time grouping

argumentation assessment