🤖 AI Summary
Current cognitive diagnosis (CD) models lack built-in privacy-preserving mechanisms, and generic machine unlearning algorithms fail to accommodate their heterogeneous parameter structures, resulting in insecure and inefficient student data removal. This paper presents the first systematic study of data unlearning for CD models, proposing Hierarchical Importance-guided Forgetting (HIF). HIF integrates instance-level and layer-level parameter importance estimation with a smooth update strategy to enable precise, controllable parameter revision. Evaluated on three real-world educational datasets, HIF significantly outperforms baseline methods, achieving an optimal trade-off among unlearning completeness (ΔAUC < 0.01), model utility (prediction accuracy degradation < 1.2%), and computational efficiency (speedup up to 2.3×). This work establishes the first verifiable, CD-specific unlearning framework for privacy-compliant data governance in educational AI.
📝 Abstract
The need to remove specific student data from cognitive diagnosis (CD) models has become a pressing requirement, driven by users' growing assertion of their "right to be forgotten". However, existing CD models are largely designed without privacy considerations and lack effective data unlearning mechanisms. Directly applying general-purpose unlearning algorithms is suboptimal, as they struggle to balance unlearning completeness, model utility, and efficiency when confronted with the unique heterogeneous structure of CD models. To address this, our paper presents the first systematic study of the data unlearning problem for CD models, proposing a novel and efficient algorithm: hierarchical importance-guided forgetting (HIF). Our key insight is that parameter importance in CD models exhibits distinct layer-wise characteristics. HIF leverages this via an innovative smoothing mechanism that combines individual- and layer-level importance, enabling a more precise distinction of parameters associated with the data to be unlearned. Experiments on three real-world datasets show that HIF significantly outperforms baselines on key metrics, offering the first effective solution for CD models to respond to user data removal requests and for deploying high-performance, privacy-preserving AI systems.
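To make the core idea concrete, here is a minimal sketch of the hierarchical smoothing step described above: per-parameter (instance-level) importance, approximated by squared gradients on the forget set, is blended with its layer-level mean before thresholding. The function names, the squared-gradient proxy, and the `alpha` / `top_frac` hyperparameters are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def hif_smooth_importance(grads_by_layer, alpha=0.5):
    """Sketch of HIF-style hierarchical importance smoothing.

    grads_by_layer: maps a layer name to the accumulated gradient of its
    parameters on the data to be unlearned (array with the parameter shape).
    The squared gradient serves as an instance-level importance proxy; each
    entry is then blended with its layer-wise mean using the smoothing
    weight `alpha` (hypothetical hyperparameter).
    """
    smoothed = {}
    for layer, g in grads_by_layer.items():
        inst = g ** 2                      # instance-level importance proxy
        layer_mean = inst.mean()           # layer-level importance
        smoothed[layer] = alpha * inst + (1 - alpha) * layer_mean
    return smoothed

def select_forget_params(smoothed, top_frac=0.1):
    """Mask the globally most important parameters (those most tied to
    the forget data) for a subsequent targeted parameter revision."""
    all_vals = np.concatenate([v.ravel() for v in smoothed.values()])
    thresh = np.quantile(all_vals, 1 - top_frac)
    return {layer: v >= thresh for layer, v in smoothed.items()}
```

The layer-mean blend is what distinguishes this from flat, per-parameter masking: it keeps the selection aware of each layer's overall importance scale, which matters because CD models mix heterogeneous components (e.g., student embeddings vs. prediction layers).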