🤖 AI Summary
In knowledge tracing, point estimation fails to disentangle students' true proficiency from behavioral noise, resulting in ambiguous mastery-state modeling. To address this, we propose KeenKT, the first KT framework to model latent knowledge states at each interaction step with the Normal-Inverse-Gaussian (NIG) distribution, explicitly decoupling ability estimation from epistemic and aleatoric uncertainty. We further introduce an NIG-distance-based attention mechanism to enhance sensitivity to learning dynamics and transient performance fluctuations. Additionally, we design a diffusion-driven denoising reconstruction objective coupled with distributional contrastive learning to jointly optimize robust state estimation and uncertainty-aware representation. Extensive experiments on six benchmark datasets demonstrate consistent superiority over state-of-the-art methods: AUC improves by up to 5.85% and ACC by up to 6.89%, significantly enhancing both predictive accuracy and robustness to behavioral noise.
📝 Abstract
Knowledge Tracing (KT) aims to dynamically model a student's mastery of knowledge concepts based on their historical learning interactions. Most current methods rely on single-point estimates, which cannot distinguish true ability from transient behaviors such as lucky guesses or careless slips, creating ambiguity in judging mastery. To address this issue, we propose a Knowledge Mastery-State Disambiguation model for Knowledge Tracing (KeenKT), which represents a student's knowledge state at each interaction as a Normal-Inverse-Gaussian (NIG) distribution, thereby capturing the fluctuations in student learning behaviors. Furthermore, we design an NIG-distance-based attention mechanism to model the dynamic evolution of the knowledge state. In addition, we introduce a diffusion-based denoising reconstruction loss and a distributional contrastive learning loss to enhance the model's robustness. Extensive experiments on six public datasets demonstrate that KeenKT outperforms state-of-the-art (SOTA) KT models in both prediction accuracy and sensitivity to behavioral fluctuations, yielding AUC improvements of up to 5.85% and ACC improvements of up to 6.89%.
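To make the core idea concrete, here is a minimal, hypothetical sketch of the two ingredients the abstract names: a head that maps a hidden state to valid NIG parameters, and a distance-based attention rule over those distributional states. The paper's actual NIG distance and network architecture are not given here, so this sketch substitutes a moment-matched 2-Wasserstein distance and simple NumPy functions purely for illustration.

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def nig_params(h):
    """Hypothetical head: map a 4-dim hidden vector to valid
    Normal-Inverse-Gaussian parameters (mu, alpha, beta, delta).
    Constraints: alpha > 0, delta > 0, |beta| < alpha."""
    mu = h[0]
    alpha = softplus(h[1]) + 1e-6
    beta = np.tanh(h[2]) * alpha * 0.99   # keeps |beta| < alpha
    delta = softplus(h[3]) + 1e-6
    return mu, alpha, beta, delta

def nig_moments(mu, alpha, beta, delta):
    """Mean and variance of the NIG distribution (standard closed forms)."""
    gamma = np.sqrt(alpha**2 - beta**2)
    mean = mu + delta * beta / gamma
    var = delta * alpha**2 / gamma**3
    return mean, var

def nig_distance(p, q):
    """Stand-in for the paper's NIG distance (an assumption, not the
    published definition): 2-Wasserstein distance between the
    moment-matched Gaussians of two NIG states."""
    m1, v1 = nig_moments(*p)
    m2, v2 = nig_moments(*q)
    return np.sqrt((m1 - m2)**2 + (np.sqrt(v1) - np.sqrt(v2))**2)

def attention_weights(query, keys, tau=1.0):
    """Distance-based attention: interactions whose NIG states lie
    closer to the query state receive exponentially higher weight."""
    d = np.array([nig_distance(query, k) for k in keys])
    w = np.exp(-d / tau)
    return w / w.sum()

# Toy usage: five random interaction states, attend from the latest one.
rng = np.random.default_rng(0)
states = [nig_params(rng.normal(size=4)) for _ in range(5)]
print(np.round(attention_weights(states[-1], states[:-1]), 3))
```

In this toy setup, the variance term lets two states with the same mean mastery level still be told apart by their uncertainty, which is the disambiguation the single-point-estimate baselines cannot express.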