🤖 AI Summary
This work addresses the challenge of automatically generating high-quality multiple-choice questions (MCQs) with accurate, interpretable difficulty estimates for adaptive intelligent tutoring systems. The authors propose a structured, knowledge-driven approach that first uses a large language model to construct a knowledge graph from educational documents, then systematically generates MCQs and distractors from the graph’s nodes and relations. The method combines nine interpretable difficulty signals in a data-driven model that predicts question difficulty. By combining knowledge graphs, large language models, and multi-signal difficulty modeling (a combination presented here for the first time), the framework achieves high item quality and difficulty estimates that align closely with human perception while remaining transparent and explainable.
📝 Abstract
Generating multiple-choice questions (MCQs) with difficulty estimates remains challenging for the automated MCQ-generation systems used in adaptive, AI-assisted education. This study proposes a novel methodology that generates MCQs with difficulty estimates from input documents by using knowledge graphs (KGs) and large language models (LLMs). Our approach uses an LLM to construct a KG from the input documents, from which MCQs are then systematically generated. Each MCQ is generated by selecting a node from the KG as the key, sampling a related triple or quintuple (optionally augmented with an extra triple), and prompting an LLM to generate a corresponding stem from these graph components. Distractors are then selected from the KG. For each MCQ, nine difficulty signals are computed and combined into a unified difficulty score using a data-driven approach. Experimental results demonstrate that our method generates high-quality MCQs whose difficulty estimates are interpretable and align with human perceptions. Our approach improves automated MCQ generation by integrating structured knowledge representations with LLMs and a data-driven difficulty estimation model.
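The pipeline described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy triples, the function names `build_stem`, `generate_mcq`, and `difficulty_score`, the template that stands in for the LLM stem prompt, the random distractor choice, and the linear weighting of signals. The paper's nine signals and its actual combination model are not specified in the abstract, so the signal names below are placeholders.

```python
import random
from dataclasses import dataclass

# Toy knowledge graph as (head, relation, tail) triples. Illustrative data only.
TRIPLES = [
    ("mitochondrion", "produces", "ATP"),
    ("chloroplast", "performs", "photosynthesis"),
    ("ribosome", "synthesizes", "protein"),
    ("nucleus", "stores", "DNA"),
]


@dataclass
class MCQ:
    key: str        # correct answer, a node of the graph
    triple: tuple   # graph component the stem is built from
    stem: str       # question text
    options: list   # key plus distractors, shuffled


def build_stem(triple, key):
    """Hypothetical stand-in for the LLM prompt: mask the key in the triple."""
    head, relation, tail = triple
    if key == head:
        return f"Which entity {relation} {tail}?"
    return f"{head} {relation} what?"


def generate_mcq(triples, key, n_distractors=3, rng=None):
    """Pick a triple containing the key, then draw distractors from other nodes."""
    rng = rng or random.Random(0)
    related = [t for t in triples if key in (t[0], t[2])]
    if not related:
        raise ValueError(f"no triple mentions {key!r}")
    triple = rng.choice(related)
    nodes = sorted({node for h, _, t in triples for node in (h, t)})
    # Exclude both ends of the sampled triple so no option repeats the stem or key.
    candidates = [n for n in nodes if n not in (triple[0], triple[2])]
    distractors = rng.sample(candidates, min(n_distractors, len(candidates)))
    options = distractors + [key]
    rng.shuffle(options)
    return MCQ(key, triple, build_stem(triple, key), options)


def difficulty_score(signals, weights, bias=0.0):
    """Assumed linear combination; weights would be fit to human difficulty ratings."""
    return bias + sum(weights[name] * value for name, value in signals.items())


mcq = generate_mcq(TRIPLES, "ATP")
print(mcq.stem, mcq.options)

# Placeholder signal names and values, not the paper's nine signals.
signals = {"distractor_similarity": 0.5, "stem_length": 1.0}
weights = {"distractor_similarity": 2.0, "stem_length": 1.0}
print(difficulty_score(signals, weights))
```

A linear score is used here only because each weight can be read directly as the contribution of one signal, which matches the abstract's emphasis on interpretability; the paper's data-driven model may take a different form.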