🤖 AI Summary
This work presents a teacher-in-the-loop multi-agent system that helps mathematics teachers generate personalized middle school math problems. The teacher provides a base problem and a desired topic; a large language model drafts the personalized problem, and four specialized agents then evaluate the draft for mathematical accuracy, authenticity, readability, and realism. The system lets teachers iteratively refine and deploy problems, keeping control of the generated content in their hands. Eight middle school mathematics teachers used the system to create 212 problems in ASSISTments and assigned them to their students. The agents flagged many realism issues while problems were being written, but teachers and students noted few realism issues in the final versions, and readability issues and mathematical hallucinations were also somewhat rare. Both teachers and students nonetheless wanted to modify fine-grained personalized elements of the real-world context, which signals remaining issues with authenticity and fit.
📝 Abstract
Large language models can increasingly adapt educational tasks to learners' characteristics. In the present study, we examine a multi-agent teacher-in-the-loop system for personalizing middle school math problems. The teacher enters a base problem and a desired topic, the LLM generates the personalized problem, and four AI agents then evaluate it, each against the criterion it specializes in (mathematical accuracy, authenticity, readability, or realism). Eight middle school mathematics teachers created 212 problems in ASSISTments using the system and assigned these problems to their students. We find that both teachers and students wanted to modify the fine-grained personalized elements of the real-world context of the problems, signaling issues with authenticity and fit. Although the agents detected many issues with realism as the problems were being written, teachers and students noted few realism issues in the final versions. Issues with readability and mathematical hallucinations were also somewhat rare. We discuss implications for multi-agent personalization systems that support teacher control.
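The workflow described in the abstract (generate a draft, then have four criterion-specific agents review it before the teacher decides what to do) can be pictured with a minimal Python sketch. This is not the authors' implementation: the names `personalize`, `Review`, `stub_generate`, and `stub_evaluators` are hypothetical, and the generator and evaluators are plain stub functions standing in for LLM calls, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# The four evaluation criteria named in the abstract.
CRITERIA = ("mathematical accuracy", "authenticity", "readability", "realism")

@dataclass
class Review:
    """Feedback from one evaluator agent (hypothetical structure)."""
    criterion: str
    issues: List[str] = field(default_factory=list)

# Assumed interfaces: a generator maps (base_problem, topic) to a draft,
# and an evaluator maps a draft to a list of issue descriptions.
Generator = Callable[[str, str], str]
Evaluator = Callable[[str], List[str]]

def personalize(
    base_problem: str,
    topic: str,
    generate: Generator,
    evaluators: Dict[str, Evaluator],
) -> Tuple[str, List[Review]]:
    """Draft a personalized problem and collect one review per criterion.

    The teacher, not this function, decides whether to accept or revise.
    """
    draft = generate(base_problem, topic)
    reviews = [Review(c, evaluators[c](draft)) for c in CRITERIA]
    return draft, reviews

# Stubs standing in for the LLM generator and the evaluator agents.
def stub_generate(base_problem: str, topic: str) -> str:
    return f"[{topic}] {base_problem}"

def no_issues(draft: str) -> List[str]:
    return []

def flag_realism(draft: str) -> List[str]:
    # Toy rule for illustration only: flag an implausible price.
    return ["price looks unrealistic"] if "$900 soda" in draft else []

stub_evaluators: Dict[str, Evaluator] = {
    "mathematical accuracy": no_issues,
    "authenticity": no_issues,
    "readability": no_issues,
    "realism": flag_realism,
}

if __name__ == "__main__":
    draft, reviews = personalize(
        "Sam buys a $900 soda and pays with $1000. What is the change?",
        "soccer",
        stub_generate,
        stub_evaluators,
    )
    print(draft)
    for review in reviews:
        print(review.criterion, review.issues)
```

In this sketch the reviews are only returned, never acted on automatically, which mirrors the teacher-in-the-loop framing: the teacher reads the flagged issues and chooses whether to edit, regenerate, or assign the problem.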