🤖 AI Summary
Students with mathematics anxiety often struggle to comprehend higher-order mathematical problems, which hinders conceptual understanding and metacognitive development. Method: We propose MEGA, an LLM-based pedagogical intervention framework that integrates Socratic questioning, chain-of-thought (CoT) reasoning, lightweight gamification, and real-time formative feedback. It is designed to foster active engagement, in contrast to passive exposition or standard CoT. MEGA uses guided inquiry to stimulate self-directed reasoning, coupled with progress visualization and immediate feedback to close the learning loop. Contribution/Results: In experiments using GPT-4o and Claude 3.5 Sonnet on samples from the GSM8K and MATH benchmarks, university students rated MEGA's explanations as better for learning in more instances than standard CoT on both datasets. On the more difficult MATH benchmark, MEGA was preferred in 47.5% of instances, compared with 26.67% for CoT. This suggests that MEGA is more effective at explaining difficult problems. MEGA offers a scalable, empirically evaluable approach to using LLMs to support active, anxiety-mitigated mathematics learning.
📝 Abstract
This paper presents an intervention study on how university students' Maths learning, driven by large language models (LLMs), is affected by a combination of four methods: (1) the Socratic method, (2) Chain of Thought (CoT) reasoning, (3) simplified gamification and (4) formative feedback. We call our approach Mathematics Explanations through Games by AI LLMs (MEGA). Some students struggle with Maths and, as a result, avoid Maths-related disciplines or subjects, despite the importance of Maths across many fields, including signal processing. Students' difficulties with Maths often stem from suboptimal pedagogy. We compared the MEGA method with the traditional step-by-step (CoT) method to ascertain which is better, using a within-group design in which questions were randomly assigned to the participants, who are university students. Samples (n=60) were randomly drawn from each of the two test sets of the Grade School Math 8K (GSM8K) and Mathematics Aptitude Test of Heuristics (MATH) datasets. The sample size was based on an error margin of 11%, a confidence level of 90%, and a manageable number of samples for the student evaluators. These samples were used to evaluate at length two capable LLMs, Generative Pretrained Transformer 4o (GPT-4o) and Claude 3.5 Sonnet, selected from an initial six that were tested for capability. The results showed that, for both datasets, students rated the MEGA method as better for learning in more instances than CoT. The gap is largest on the more difficult MATH dataset (47.5% for MEGA compared to 26.67% for CoT), indicating that MEGA is better at explaining difficult Maths problems.
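The sample-size reasoning above (11% error margin, 90% confidence level, n=60) can be checked with a short calculation. The abstract does not say which formula was used, so the sketch below assumes the standard Cochran formula for a proportion, with the most conservative proportion p = 0.5 and no finite-population correction. The function names are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def cochran_sample_size(margin: float, confidence: float, p: float = 0.5) -> int:
    """Minimum sample size for estimating a proportion (Cochran's formula).

    Assumption: p = 0.5 (worst case) and an effectively infinite population;
    neither choice is stated in the abstract.
    """
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p * (1 - p) / margin**2)


def margin_of_error(n: int, confidence: float, p: float = 0.5) -> float:
    """Margin of error obtained with n samples, under the same assumptions."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)


required = cochran_sample_size(margin=0.11, confidence=0.90)
achieved = margin_of_error(n=60, confidence=0.90)
print(f"required n: {required}")
print(f"margin of error with n=60: {achieved:.3f}")
```

Under these assumptions the formula gives a minimum of 56 samples, so drawing 60 per dataset sits slightly above that minimum and corresponds to a margin of error just under 11%.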