🤖 AI Summary
Addressing the challenge of identifying student misunderstandings in large-scale online discussion forums and translating them into effective pedagogical actions, this paper proposes the Misunderstanding-to-Mastery (M2M) framework. M2M integrates large language models (LLMs) with retrieval-augmented generation (RAG) to detect common misunderstandings across large volumes of discussion posts and to generate actionable instructional interventions. The framework is demonstrated on real-world data from three computer science courses (1,355 students; 2,878 unique posts) and evaluated by five instructors who teach those courses. Instructors found the approach promising and valuable for teaching: it effectively surfaces misunderstandings and converts the analysis into concrete teaching actions, including lecture content, exercise design, and feedback phrasing, supporting a data-driven, adaptive instruction loop. Instructors also flagged open issues, including the need for finer-grained groupings, clearer metrics, validation of the generated resources, and ethical considerations around data anonymity.
📝 Abstract
In the contemporary educational landscape, particularly in large classroom settings, discussion forums have become a crucial tool for promoting interaction and addressing student queries. These forums foster a collaborative learning environment where students engage with both the teaching team and their peers. However, the sheer volume of content generated in these forums poses two significant interconnected challenges: How can we effectively identify common misunderstandings that arise in student discussions? And once identified, how can instructors use these insights to address them effectively? This paper explores an approach that integrates large language models (LLMs) and Retrieval-Augmented Generation (RAG) to tackle these challenges. We then demonstrate the approach, Misunderstanding to Mastery (M2M), with authentic data from three computer science courses, involving 1,355 students and 2,878 unique posts, followed by an evaluation with five instructors teaching these courses. Results show that instructors found the approach promising and valuable for teaching, effectively identifying misunderstandings and generating actionable insights. Instructors highlighted the need for more fine-grained groupings, clearer metrics, validation of the created resources, and ethical considerations around data anonymity.
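The paper does not specify the pipeline at implementation level, but the general RAG pattern it describes (ground each forum post in relevant course material before asking an LLM to flag a misunderstanding and suggest an intervention) can be sketched as follows. Everything here is illustrative: the token-overlap retrieval is a toy stand-in for a real retriever, and the assembled prompt is a hypothetical example, not the authors' actual prompt. The LLM call itself is omitted.

```python
# Hypothetical sketch of an M2M-style RAG step: retrieve the course-material
# snippet most relevant to a forum post, then assemble the grounded prompt an
# LLM would receive. Retrieval uses naive token overlap purely for
# illustration; a real system would use embeddings or a search index.

def tokenize(text: str) -> set[str]:
    """Lowercase and strip trailing punctuation to get a bag of words."""
    return {w.strip(".,;?!").lower() for w in text.split()}

def retrieve(post: str, materials: list[str]) -> str:
    """Return the course-material snippet with the highest token overlap."""
    post_tokens = tokenize(post)
    return max(materials, key=lambda m: len(post_tokens & tokenize(m)))

def build_prompt(post: str, context: str) -> str:
    """Assemble a grounded prompt (the actual LLM call is omitted)."""
    return (
        "Course material:\n" + context + "\n\n"
        "Student post:\n" + post + "\n\n"
        "Does this post reveal a misunderstanding of the material? "
        "If so, name it and suggest one instructional action."
    )

# Toy data standing in for course materials and a student post.
materials = [
    "A Python list is mutable; a tuple is immutable.",
    "Recursion requires a base case to terminate.",
]
post = "Why does my recursion never stop even though it calls itself?"
prompt = build_prompt(post, retrieve(post, materials))
print(prompt)
```

In the full framework as described, prompts like this would be issued over clusters of posts rather than single posts, and the LLM's outputs aggregated into the misunderstanding groupings and teaching actions the instructors evaluated.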