🤖 AI Summary
To address low efficiency, limited semantic diversity, and weak alignment with pedagogical objectives in personalized educational question generation, this paper proposes a multi-agent collaborative framework. It comprises five specialized agents (planning, generation, solving, pedagogical evaluation, and verification) organized into a closed-loop iterative system. The framework integrates structured task planning, a multi-dimensional binary-scoring evaluation model, and a dynamic feedback mechanism. Compared to conventional single-agent or rule-based approaches, it improves the stability of question quality, semantic diversity, and pedagogical alignment. Experiments on two benchmark mathematics question datasets demonstrate substantial improvements over state-of-the-art methods: +32.7% in diversity, +28.4% in pedagogical objective alignment, and +25.1% in overall quality. This work establishes a novel paradigm for automated, scalable, and adaptive generation of high-quality educational resources.
📝 Abstract
High-quality personalized question banks are crucial for supporting adaptive learning and individualized assessment. Manually designing questions is time-consuming and often fails to meet diverse learning needs, which makes automated question generation a key approach for reducing teachers' workload and improving the scalability of educational resources. However, most existing question generation methods rely on single-agent or rule-based pipelines, which still produce questions with unstable quality, limited diversity, and insufficient alignment with educational goals. To address these challenges, we propose EduAgentQG, a multi-agent collaborative framework for generating high-quality and diverse personalized questions. The framework consists of five specialized agents and operates through an iterative feedback loop: the Planner generates structured design plans and multiple question directions to enhance diversity; the Writer produces candidate questions based on the plan and refines their quality and diversity using feedback from the Solver and Educator; the Solver and Educator each perform binary scoring across multiple evaluation dimensions and feed the results back to the Writer; and the Checker conducts final verification, including answer correctness and clarity, to ensure alignment with educational goals. Through this multi-agent collaboration and iterative feedback loop, EduAgentQG generates questions that are both high-quality and diverse while remaining consistent with educational objectives. Experiments on two mathematics question datasets demonstrate that EduAgentQG outperforms existing single-agent and multi-agent methods in terms of question diversity, goal consistency, and overall quality.
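The control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no interfaces, so every function name, the `Candidate` type, the evaluation dimension names (`solvable`, `goal_aligned`, `difficulty_ok`), and the `max_rounds` budget are assumptions made for illustration, and each agent is a deterministic stub standing in for what would presumably be a language-model call.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional

# All names here are hypothetical; the abstract does not specify any interface.
Scores = Dict[str, bool]  # evaluation dimension -> pass/fail (binary scoring)


@dataclass
class Candidate:
    direction: str
    text: str
    addressed: FrozenSet[str] = frozenset()  # dimensions the Writer has fixed
    revisions: int = 0


def planner(goal: str, n_directions: int) -> List[str]:
    """Stub Planner: propose several question directions for one goal."""
    return [f"{goal} / direction {i + 1}" for i in range(n_directions)]


def writer(direction: str, feedback: List[str],
           previous: Optional[Candidate]) -> Candidate:
    """Stub Writer: draft a question, or revise it using failed dimensions."""
    if previous is None:
        return Candidate(direction, f"Draft question for {direction}")
    return Candidate(direction, previous.text + " (revised)",
                     previous.addressed | frozenset(feedback),
                     previous.revisions + 1)


def solver(c: Candidate) -> Scores:
    """Stub Solver: passes only after the Writer has addressed solvability."""
    return {"solvable": "solvable" in c.addressed}


def educator(c: Candidate) -> Scores:
    """Stub Educator: binary scores on assumed pedagogical dimensions."""
    return {"goal_aligned": "goal_aligned" in c.addressed,
            "difficulty_ok": True}


def checker(c: Candidate) -> bool:
    """Stub Checker: final gate (here, only that the text is non-empty)."""
    return bool(c.text.strip())


def failed_dimensions(c: Candidate) -> List[str]:
    """Merge Solver and Educator scores and list the dimensions that failed."""
    scores = {**solver(c), **educator(c)}
    return [dim for dim, ok in scores.items() if not ok]


def generate(goal: str, n_directions: int = 3,
             max_rounds: int = 3) -> List[Candidate]:
    accepted = []
    for direction in planner(goal, n_directions):
        candidate = writer(direction, [], None)
        failed = failed_dimensions(candidate)
        rounds = 0
        # Feedback loop: revise until every dimension passes or budget runs out.
        while failed and rounds < max_rounds:
            candidate = writer(direction, failed, candidate)
            failed = failed_dimensions(candidate)
            rounds += 1
        if not failed and checker(candidate):
            accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    for c in generate("linear equations"):
        print(c.revisions, c.text)
```

In this sketch the Planner's multiple directions are the source of diversity, while the binary scores act purely as a revision signal: the Writer receives only the names of the failed dimensions. A candidate that never passes within the revision budget is dropped rather than sent to the Checker, which is one possible design choice and not something the abstract states.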