🤖 AI Summary
This study addresses the scalability limitations of traditional van Hiele geometric reasoning level assessments, which rely on expert manual analysis of open-ended responses. To overcome this challenge, the authors construct a theory-informed structured dictionary encompassing 33 fine-grained geometric reasoning skills and propose a novel, theory-driven framework for automatically diagnosing pre-service teachers’ van Hiele levels from open-ended answers. The framework explicitly incorporates skill-level information by integrating retrieval-augmented generation (RAG) with multi-task learning (MTL) to enable skill-aware classification. Experimental results on 226 teacher responses demonstrate that the proposed approach significantly outperforms baseline methods lacking explicit skill information, offering an effective and scalable solution for large-scale assessment of geometric content knowledge.
📝 Abstract
Assessing teachers' geometric content knowledge is essential to the quality of geometry instruction and to student learning, but it is difficult to scale. The van Hiele model characterizes geometric reasoning through five hierarchical levels. Traditional van Hiele assessment relies on manual expert analysis of open-ended responses, a process that is time-consuming and costly and that prevents large-scale evaluation. This study develops an automated approach for diagnosing teachers' van Hiele reasoning levels using large language models grounded in educational theory. Our central hypothesis is that integrating explicit skills information significantly improves van Hiele classification. In collaboration with mathematics education researchers, we built a structured skills dictionary that decomposes the van Hiele levels into 33 fine-grained reasoning skills. Through a custom web platform, 31 pre-service teachers solved geometry problems, yielding 226 responses. Expert researchers then annotated each response with its van Hiele level and the skills from the dictionary that it demonstrated. Using this annotated dataset, we implemented two classification approaches: (1) retrieval-augmented generation (RAG) and (2) multi-task learning (MTL). For each approach, we compared a skills-aware variant that incorporates the skills dictionary against a baseline without skills information. For both methods, the skills-aware variants significantly outperformed the baselines across multiple evaluation metrics. This work provides the first automated approach for van Hiele level classification from open-ended responses. It offers a scalable, theory-grounded method for assessing teachers' geometric reasoning that can enable large-scale evaluation and support adaptive, personalized teacher learning systems.
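The contrast between a skills-aware RAG variant and a baseline without skills information can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the retriever, the prompt format, or the model, so everything below is an assumption made for illustration. The `AnnotatedResponse` type, the skill identifiers, the level numbering, the example responses, and the toy bag-of-words retriever are all hypothetical, and no language model is called. The sketch shows only how retrieved expert-annotated examples might carry their skill labels into the prompt in one variant and omit them in the other.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from math import sqrt


@dataclass
class AnnotatedResponse:
    """One expert-annotated answer (hypothetical schema, not from the paper)."""
    text: str
    level: int          # van Hiele level label; the numbering here is illustrative
    skills: list[str]   # skill identifiers; these names are invented for the sketch


def bag_of_words(text: str) -> Counter:
    """Lowercased whitespace tokens with counts (a stand-in for a real embedding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, pool: list[AnnotatedResponse], k: int = 2) -> list[AnnotatedResponse]:
    """Return the k annotated responses most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, bag_of_words(ex.text)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, pool: list[AnnotatedResponse], skills_aware: bool, k: int = 2) -> str:
    """Assemble a few-shot prompt; only the skills-aware variant shows skill labels."""
    lines = ["Classify the van Hiele level of the final response."]
    for ex in retrieve(query, pool, k):
        lines.append(f"Response: {ex.text}")
        if skills_aware:
            lines.append(f"Skills: {', '.join(ex.skills)}")
        lines.append(f"Level: {ex.level}")
    lines.append(f"Response: {query}")
    lines.append("Level:")
    return "\n".join(lines)


# Invented examples, used only to exercise the functions above.
POOL = [
    AnnotatedResponse("it looks like a square because of its shape", 1,
                      ["recognize_by_appearance"]),
    AnnotatedResponse("a square has four equal sides and four right angles", 2,
                      ["list_properties"]),
    AnnotatedResponse("every square is a rectangle because it has four right angles", 3,
                      ["class_inclusion"]),
]

if __name__ == "__main__":
    query = "this has four right angles so it is a rectangle"
    print(build_prompt(query, POOL, skills_aware=True))
    print("---")
    print(build_prompt(query, POOL, skills_aware=False))
```

Running the script prints the two prompts side by side. They retrieve the same examples and differ only in the `Skills:` lines, which is the single factor the skills-aware versus baseline comparison isolates.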