🤖 AI Summary
Programming Knowledge Tracing (PKT) is affected by two types of noise in long student submission sequences: unwanted signals from unrelated submissions and weak, hard-to-distinguish signals from minor code modifications. Both limit model performance. To address this, the authors propose Coda, a code graph-based tuning adaptor that identifies unwanted signals through semantic similarity on the code graph and applies a cluster-aware Graph Convolutional Network (GCN) to improve the discrimination of weak signals. The adaptor is model-agnostic and can be attached to most existing PKT models. It is trained with two noise feature-based constraints and a navigational regularization term to correct knowledge states affected by noise. On four real-world datasets, Coda outperforms typical baselines on the PKT task in the presence of noisy programming records.
📝 Abstract
Programming Knowledge Tracing (PKT) aims to dynamically diagnose learners' mastery of programming knowledge based on their coding activities, enabling more effective and personalized programming education. However, current PKT studies focus primarily on the implicit relationship between code content and knowledge assessment, and often overlook two types of noise signals in long-term programming activities: unwanted signals from unrelated submissions and weak signals from minor modifications. This practical challenge significantly limits model performance and applicability. To address this issue, we propose Coda, a Code graph-based tuning adaptor designed to enhance existing PKT models by identifying noise and mitigating its impact. Specifically, Coda first transforms the loose code sequences submitted by each learner into a compact code graph. Using this code graph, unwanted signals can be identified from a semantic similarity perspective. We then apply a cluster-aware GCN to the code graph, which improves the discrimination of weak signals and allows them to be identified through clustering. Finally, a lightweight yet effective adaptor is incorporated into the PKT task and optimized with two noise feature-based constraints and a navigational regularization term, in order to correct knowledge states affected by noise. The Coda framework is model-agnostic and can be adapted to most existing PKT solutions. Extensive experimental results on four real-world datasets demonstrate that Coda effectively performs the PKT task in the presence of noisy programming records, outperforming typical baselines.
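
The first two steps described above (building a code graph from a learner's submissions and using similarity to spot the two kinds of noise) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the code encoder, the graph construction rule, or any thresholds. Here, Jaccard similarity over token sets stands in for semantic similarity, and the function names (`build_code_graph`, `flag_noise`) and both thresholds are hypothetical. The cluster-aware GCN and the adaptor are not covered.

```python
import re
from itertools import combinations


def tokens(code):
    """Crude stand-in for a semantic code encoder: identifiers plus single symbols."""
    return set(re.findall(r"[A-Za-z_]\w*|\S", code))


def similarity(a, b):
    """Jaccard similarity between the token sets of two submissions, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0


def build_code_graph(submissions, edge_threshold=0.3):
    """Connect two submissions when their similarity reaches the (assumed) threshold."""
    graph = {i: {} for i in range(len(submissions))}
    for i, j in combinations(range(len(submissions)), 2):
        s = similarity(submissions[i], submissions[j])
        if s >= edge_threshold:
            graph[i][j] = s
            graph[j][i] = s
    return graph


def flag_noise(submissions, graph, weak_threshold=0.9):
    """Return (unwanted, weak) submission indices under this toy definition.

    unwanted: nodes with no neighbours, i.e. unrelated to every other submission.
    weak: submissions that are near-duplicates of the immediately preceding one.
    """
    unwanted = [i for i, nbrs in graph.items() if not nbrs]
    weak = [
        i
        for i in range(1, len(submissions))
        if graph[i - 1].get(i, 0.0) >= weak_threshold
    ]
    return unwanted, weak


# Hypothetical submission sequence from one learner.
submissions = [
    "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s",
    "def total(xs):\n    s = 0.0\n    for x in xs:\n        s += x\n    return s",
    'print("hello world")',
]

graph = build_code_graph(submissions)
unwanted, weak = flag_noise(submissions, graph)
print("unwanted:", unwanted)
print("weak:", weak)
```

In this toy sequence, the second submission differs from the first only in a literal and is flagged as a weak signal, while the third shares almost no tokens with the others and is flagged as unwanted. A real system would replace the token-set similarity with learned code representations.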