SAP: Corrective Machine Unlearning with Scaled Activation Projection for Label Noise Robustness

📅 2024-03-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Label noise severely degrades the generalization performance of machine learning models, especially in large-scale scenarios where labeling is costly and retraining is computationally prohibitive. To address this, we propose Scaled Activation Projection (SAP), a training-free machine unlearning method grounded in Singular Value Decomposition (SVD). SAP is the first approach to apply SVD in the activation space: it identifies trustworthy samples, constructs a clean low-dimensional activation subspace via SVD, and projects noisy activations onto this subspace to suppress label-noise-induced distortions, enabling weight correction without retraining. Its core innovation lies in trustworthy-sample-driven dimensionality reduction and projection in activation space. Experiments show that SAP improves generalization accuracy by 6.0% on CIFAR-10 with 25% synthetic label noise, outperforming state-of-the-art noise-robust training methods by an average of 3.2%. On the real-world noisy dataset Clothing1M, SAP boosts ViT accuracy by 2.31%.

📝 Abstract
Label corruption, where training samples are mislabeled due to non-expert annotation or adversarial attacks, significantly degrades model performance. Acquiring large, perfectly labeled datasets is costly, and retraining models from scratch is computationally expensive. To address this, we introduce Scaled Activation Projection (SAP), a novel SVD (Singular Value Decomposition)-based corrective machine unlearning algorithm. SAP mitigates label noise by identifying a small subset of trusted samples using cross-entropy loss and projecting model weights onto a clean activation space estimated via SVD on these trusted samples. This process suppresses the noise introduced into activations by the mislabeled samples. In our experiments, we demonstrate SAP's effectiveness on synthetic noise under different settings and on real-world label noise. Applied to the CIFAR dataset with 25% synthetic corruption, SAP yields up to 6% generalization improvement. Additionally, SAP improves generalization over noise-robust training approaches on the CIFAR dataset by ~3.2% on average. Further, we observe a generalization improvement of 2.31% for a Vision Transformer model trained on the naturally corrupted Clothing1M dataset.
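The core mechanic described in the abstract — estimating a clean activation subspace from trusted samples via SVD and projecting layer weights onto it — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the plain (unscaled) projection, and the fixed rank `k` are assumptions for clarity; the paper additionally scales the projection, and the details of trusted-sample selection via cross-entropy loss are omitted here.

```python
import numpy as np

def project_weights_to_clean_subspace(W, trusted_activations, k):
    """Illustrative SVD-based weight projection (assumed interface).

    W                   : (d_out, d_in) weight matrix of one layer
    trusted_activations : (n, d_in) input activations collected from
                          trusted (low cross-entropy loss) samples
    k                   : rank of the estimated clean activation subspace
    """
    # SVD of the trusted activation matrix; the top right-singular
    # vectors span the dominant input directions, treated here as the
    # "clean" activation subspace.
    _, _, Vt = np.linalg.svd(trusted_activations, full_matrices=False)
    V_k = Vt[:k].T            # (d_in, k) orthonormal basis of the subspace
    P = V_k @ V_k.T           # (d_in, d_in) orthogonal projection matrix
    # Projecting W on its input side discards weight components acting on
    # directions outside the clean subspace, suppressing noise-induced
    # distortions without any retraining.
    return W @ P
```

Because `P` is an orthogonal projection, applying the correction twice leaves the weights unchanged, which is a quick sanity check for an implementation.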
Problem

Research questions and friction points this paper is trying to address.

Machine Learning Robustness
Label Noise
Data Annotation
Innovation

Methods, ideas, or system contributions that make the work stand out.

SAP (Scaled Activation Projection)
Noise Robustness
Model Generalization Improvement