Denoising Programming Knowledge Tracing with a Code Graph-based Tuning Adaptor

📅 2025-06-07
🏛️ Knowledge Discovery and Data Mining
📈 Citations: 2
Influential: 0
🤖 AI Summary
Programming Knowledge Tracing (PKT) is degraded by two types of noise in long student submission sequences: unwanted signals from unrelated submissions and weak signals from minor code modifications. To address this, the paper proposes Coda, a code graph-based tuning adaptor that identifies unwanted signals through semantic similarity on a code graph and applies a cluster-aware Graph Convolutional Network (GCN) to make weak signals easier to discriminate and cluster. A lightweight adaptor, optimized with two noise feature-based constraints and a navigational regularization term, then corrects the knowledge states affected by noise. Coda is model-agnostic and can be attached to most existing PKT models. On four real-world datasets, it outperforms typical baselines when programming records are noisy.

📝 Abstract
Programming Knowledge Tracing (PKT) aims to dynamically diagnose learners' mastery levels of programming knowledge based on their coding activities, facilitating more effective and personalized programming education. However, current PKT studies primarily focus on the implicit relationship between code content and knowledge assessment, often overlooking two types of noise signals in long-term programming activities: unwanted signals from unrelated submissions and weak signals from minor modifications. This practical challenge significantly limits model performance and application. To address this issue, we propose Coda, a Code graph-based tuning adaptor designed to enhance existing PKT models by identifying and mitigating the impact of noise. Specifically, Coda first transforms the loose code sequences submitted by each learner into a compact code graph. By leveraging this code graph, unwanted signals can be identified from a semantic similarity perspective. We then apply a cluster-aware GCN to the code graph, which improves the discrimination of weak signals and enables their clustering for identification. Finally, a lightweight yet effective adaptor is incorporated into the PKT task through optimization with two noise feature-based constraints and a navigational regularization term, to correct knowledge states affected by noise. It is worth mentioning that the Coda framework is model-agnostic and can be adapted to most existing PKT solutions. Extensive experimental results on four real-world datasets demonstrate that Coda effectively performs the PKT task in the presence of noisy programming records, outperforming typical baselines.
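
To make the first two steps of the abstract concrete, here is a minimal sketch of turning one learner's submissions into a similarity graph and flagging isolated submissions as candidate unwanted signals. It is an illustration under stated assumptions, not Coda's actual construction: the bag-of-tokens vectors stand in for a learned code representation, and the function names and the 0.5 threshold are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Set


def token_vector(code: str) -> Counter:
    """Bag-of-tokens stand-in for a learned code embedding (an assumption)."""
    return Counter(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_code_graph(submissions: List[str], threshold: float = 0.5) -> Dict[int, Set[int]]:
    """Link two submissions when their similarity reaches the (hypothetical) threshold."""
    vectors = [token_vector(s) for s in submissions]
    graph: Dict[int, Set[int]] = {i: set() for i in range(len(submissions))}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def unrelated_submissions(graph: Dict[int, Set[int]]) -> List[int]:
    """Treat nodes without neighbours as candidate unwanted signals."""
    return [node for node, neighbours in graph.items() if not neighbours]
```

Given two near-identical `add` functions and one unrelated `print` statement, the two `add` submissions are linked and the `print` submission is left isolated, so it is the one flagged. A real system would likely replace the token counts with embeddings from a code encoder.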

Problem

Research questions and friction points this paper is trying to address.

Identify and reduce noise in programming knowledge tracing
Transform code sequences into graphs for semantic analysis
Improve the accuracy of existing PKT models with a noise-aware adaptor

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transforms code sequences into compact code graphs
Uses cluster-aware GCN to improve signal discrimination
Incorporates lightweight adaptor with noise constraints
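
As a rough illustration of what a GCN contributes on such a code graph, the sketch below performs one standard symmetric-normalised propagation step, D^-1/2 (A + I) D^-1/2 X, in plain Python. It omits the learned weight matrix and nonlinearity of a full GCN layer and does not reproduce the paper's cluster-aware variant, whose details are not given here; the function name is hypothetical.

```python
import math
from typing import List


def gcn_propagate(adjacency: List[List[float]], features: List[List[float]]) -> List[List[float]]:
    """One propagation step: D^-1/2 (A + I) D^-1/2 X, with no learned weights."""
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own feature.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    dim = len(features[0])
    out = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if a_hat[i][j]:
                weight = a_hat[i][j] / math.sqrt(degree[i] * degree[j])
                for k in range(dim):
                    out[i][k] += weight * features[j][k]
    return out
```

With two linked submissions and one isolated submission, the linked pair's features are averaged toward each other while the isolated node keeps its own value. This smoothing over neighbours is the general mechanism by which graph convolution can pull near-duplicate submissions together in feature space.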
Weibo Gao
State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, China
Qi Liu
State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
Rui Li
State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, China
Yuze Zhao
State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, China
Hao Wang
State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, China
Linan Yue
Southeast University
Trustworthy AI, Natural Language Processing
Fangzhou Yao
University of Illinois at Urbana-Champaign
Cloud Computing, Distributed Systems
Zheng Zhang
State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, China