🤖 AI Summary
Traditional knowledge component (KC) annotation in programming education relies on labor-intensive expert labeling, which limits scalability and adaptability. Method: We propose KCGen-KT, an end-to-end large language model (LLM)-driven framework for automatic KC generation, tagging, and knowledge tracing (KT). It leverages prompt engineering and automated code semantic analysis to produce fine-grained KCs for open-ended programming problems and integrates them into an LLM-based KT model; the quality of the generated KCs is assessed through learning-curve fit under performance factor analysis (PFA). Results: On a real-world dataset of student code submissions, KCGen-KT outperforms existing KT methods. Under PFA, LLM-generated KCs achieve a level of fit comparable to that of human-written KCs, and a human evaluation finds the pipeline's KC tagging reasonably accurate relative to that of human domain experts. This work provides a fully LLM-based pipeline, from automatic KC construction to KT modeling, as a step toward scalable, personalized programming education.
📝 Abstract
Knowledge components (KCs) mapped to problems help model student learning by tracking students' mastery of fine-grained skills, thereby facilitating personalized learning and feedback in online learning platforms. However, crafting KCs and tagging problems with them, traditionally performed by human domain experts, is highly labor-intensive. We present a fully automated, LLM-based pipeline for KC generation and tagging for open-ended programming problems. We also develop an LLM-based knowledge tracing (KT) framework that leverages these LLM-generated KCs, which we refer to as KCGen-KT. We conduct extensive quantitative and qualitative evaluations to validate the effectiveness of KCGen-KT. On a real-world dataset of student code submissions to open-ended programming problems, KCGen-KT outperforms existing KT methods. We investigate the learning curves of the generated KCs and show that LLM-generated KCs have a level of fit comparable to that of human-written KCs under the performance factor analysis (PFA) model. We also conduct a human evaluation showing that the KC tagging of our pipeline is reasonably accurate when compared to that of human domain experts.
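For readers unfamiliar with PFA, the following is a minimal sketch of how the standard PFA model turns a student's practice history on each KC into a predicted probability of success. It is not taken from the paper: the KC names, the parameter values, and the helper `pfa_probability` are hypothetical and chosen only for illustration. In practice the per-KC parameters are fitted to student response data, and the quality of that fit is what allows one KC set to be compared with another.

```python
import math
from dataclasses import dataclass


@dataclass
class KCParams:
    beta: float   # baseline easiness of the KC
    gamma: float  # weight applied to each prior success on the KC
    rho: float    # weight applied to each prior failure on the KC


def pfa_probability(kcs, params, successes, failures):
    """Standard PFA prediction: sigmoid of the summed per-KC logits."""
    logit = 0.0
    for kc in kcs:
        p = params[kc]
        logit += (
            p.beta
            + p.gamma * successes.get(kc, 0)
            + p.rho * failures.get(kc, 0)
        )
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical KCs and parameter values, for illustration only.
params = {
    "for_loop": KCParams(beta=-0.5, gamma=0.3, rho=0.1),
    "array_indexing": KCParams(beta=0.2, gamma=0.2, rho=-0.1),
}
problem_kcs = ["for_loop", "array_indexing"]

# Prediction for a student with no practice history on either KC.
p_before = pfa_probability(problem_kcs, params, {}, {})

# Prediction after two prior successes on each KC.
p_after = pfa_probability(
    problem_kcs, params, {"for_loop": 2, "array_indexing": 2}, {}
)
```

With positive `gamma` values, accumulated successes raise the predicted probability, which is the learning-curve behavior a well-chosen KC is expected to show.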