Toward LLM-Supported Automated Assessment of Critical Thinking Subskills

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of automating fine-grained assessment of critical thinking subskills in argumentative writing. We propose the first operational, granular coding rubric for these subskills and construct a manually annotated corpus of student essays. Methodologically, we empirically evaluate zero-shot prompting, few-shot prompting, and supervised fine-tuning across three models (GPT-5, GPT-5-mini, and ModernBERT). Our key contributions are threefold: (1) the first scalable, automated measurement framework for multidimensional critical thinking subskills; (2) empirical evidence that GPT-5 with few-shot prompting performs best, particularly on high-frequency, well-separated subskill categories, while performance degrades on subskills requiring subtle distinctions or rare categories, and that open-weight models offer practical accuracy at lower cost with reduced sensitivity to minority categories; and (3) validation of the feasibility of automating higher-order thinking assessment, along with identification of its key bottlenecks. This work establishes a methodological benchmark and offers actionable insights for AI-driven educational assessment.
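
As a rough sketch of what the few-shot prompting condition might look like in practice (the paper's actual prompts, rubric wording, and label set are not reproduced here, so the rubric, demonstrations, and model identifier below are illustrative assumptions), a single essay excerpt could be scored against one subskill with an OpenAI-style chat API:

```python
# Minimal sketch of few-shot prompting for subskill scoring.
# The rubric, labels, demonstrations, and model name are illustrative
# assumptions, not taken from the paper.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RUBRIC = (
    "Score the 'evaluating evidence' subskill on a 0-2 scale:\n"
    "0 = no evaluation of evidence, 1 = evidence mentioned but not weighed, "
    "2 = evidence explicitly weighed against a claim."
)

# Few-shot demonstrations: (essay excerpt, gold label) pairs from human coding.
FEW_SHOT = [
    ("Some people say homework helps, so it must be good.", "0"),
    ("The survey shows 70% improvement, which supports the policy, "
     "although the sample was small.", "2"),
]

def score_excerpt(excerpt: str, model: str = "gpt-5-mini") -> str:
    """Return the model's predicted rubric label for one essay excerpt."""
    messages = [{"role": "system", "content": RUBRIC}]
    for text, label in FEW_SHOT:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": excerpt})
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content.strip()

print(score_excerpt("The author cites one study but never questions its methods."))
```

In a setup like this, reliability would typically be checked by comparing the model's labels against the human-coded corpus, for example with Cohen's kappa per subskill.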

📝 Abstract
Critical thinking represents a fundamental competency in today's education landscape. Developing critical thinking skills through timely assessment and feedback is crucial; however, there has not been extensive work in the learning analytics community on defining, measuring, and supporting critical thinking. In this paper, we investigate the feasibility of measuring core "subskills" that underlie critical thinking. We ground our work in an authentic task where students operationalize critical thinking: student-written argumentative essays. We developed a coding rubric based on an established skills progression and completed human coding for a corpus of student essays. We then evaluated three distinct approaches to automated scoring: zero-shot prompting, few-shot prompting, and supervised fine-tuning, implemented across three large language models (GPT-5, GPT-5-mini, and ModernBERT). GPT-5 with few-shot prompting achieved the strongest results and demonstrated particular strength on subskills with separable, frequent categories, while lower performance was observed for subskills that required detection of subtle distinctions or rare categories. Our results underscore critical trade-offs in automated critical thinking assessment: proprietary models offer superior reliability at higher cost, while open-source alternatives provide practical accuracy with reduced sensitivity to minority categories. Our work represents an initial step toward scalable assessment of higher-order reasoning skills across authentic educational contexts.
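Similarly, the supervised fine-tuning approach could be sketched as fine-tuning ModernBERT as a per-subskill sequence classifier with Hugging Face Transformers; the checkpoint name, label scale, hyperparameters, and toy data below are assumptions, not the authors' configuration:

```python
# Minimal sketch: fine-tuning ModernBERT for per-subskill classification.
# Checkpoint, hyperparameters, and the toy dataset are illustrative assumptions;
# ModernBERT support requires a recent transformers release.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import Dataset

MODEL_NAME = "answerdotai/ModernBERT-base"  # assumed checkpoint
NUM_LABELS = 3                              # e.g. a 0-2 rubric scale

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_LABELS)

# Toy stand-in for the human-coded essay corpus: excerpt text + integer label.
train = Dataset.from_dict({
    "text": ["Evidence is weighed against the counterclaim.",
             "The essay restates the prompt without analysis."],
    "label": [2, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length",
                     max_length=512)

train = train.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="modernbert-subskill",
                           num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=train,
)
trainer.train()
```

One classifier head per subskill (or a multi-task head) would be a natural design choice here, since the paper treats each subskill as its own coding dimension.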
Problem

Research questions and friction points this paper is trying to address.

Automating assessment of critical thinking subskills in education
Evaluating LLM approaches for scoring argumentative student essays
Addressing trade-offs between reliability and cost in automated evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated scoring using large language models
Few-shot prompting for critical thinking assessment
Evaluating proprietary versus open-source model trade-offs
Marisa C. Peczuh
University of Minnesota
Nischal Ashok Kumar
University of Massachusetts Amherst
Ryan Baker
Adelaide University
Educational Data Mining, Learning Analytics, Learning Engineering, Quantitative Ethnography
Blair Lehman
Brighter Research
emotion, motivation, learning, assessment
Danielle Eisenberg
Educational Testing Service (ETS)
Caitlin Mills
Associate Professor, University of Minnesota
mind wandering, boredom, affect, engagement
Keerthi Chebrolu
University of Massachusetts Amherst
Sudhip Nashi
University of Massachusetts Amherst
Cadence Young
University of Massachusetts Amherst
Brayden Liu
University of Massachusetts Amherst
Sherry Lachman
Advanced Education Research and Development Fund (AERDF)
Andrew Lan
University of Massachusetts Amherst