🤖 AI Summary
This work investigates the applicability and limitations of Sharpness-Aware Minimization (SAM) in machine unlearning: SAM abandons its inherent denoising property when fitting forget signals, inducing learning bias on retained data and yielding test error bounds that vary with signal strength. To address this, the authors propose Sharp MinMax, a dual-model framework that splits parameters to separately perform SAM minimization on the retain set (preserving generalization) and sharpness maximization on the forget set (amplifying sensitivity to perturbations to aid unlearning). The work characterizes SAM's signal surplus in terms of signal strength, which relaxes the amount of retain signal needed to maintain model performance. Experiments demonstrate that the method reduces feature entanglement between retain and forget sets, achieves state-of-the-art unlearning performance across difficulties measured by data memorization, strengthens resistance to membership inference attacks, and yields flatter loss landscapes.
📝 Abstract
We characterize the effectiveness of Sharpness-Aware Minimization (SAM) under the machine unlearning scheme, where unlearning forget signals interferes with learning retain signals. While previous work proves that SAM improves generalization by preventing noise memorization, we show that SAM abandons this denoising property when fitting the forget set, leading to test error bounds that vary with signal strength. We further characterize the signal surplus of SAM in the order of signal strength, which enables learning from fewer retain signals to maintain model performance while placing more weight on unlearning the forget set. Empirical studies show that SAM outperforms SGD with a relaxed requirement for retain signals and can enhance various unlearning methods as either a pretraining or an unlearning algorithm. Observing that overfitting can benefit more stringent sample-specific unlearning, we propose Sharp MinMax, which splits the model in two to learn retain signals with SAM and unlearn forget signals with sharpness maximization, achieving the best performance. Extensive experiments show that SAM enhances unlearning across varying difficulties measured by data memorization, yielding decreased feature entanglement between retain and forget sets, stronger resistance to membership inference attacks, and a flatter loss landscape.
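For intuition, the standard SAM update ascends to a nearby worst-case point before descending, and flipping the sign of the outer step gives one plausible reading of sharpness maximization on the forget set. The sketch below is illustrative only, assuming a toy quadratic loss and a hypothetical `sam_step` helper; it is not the paper's Sharp MinMax implementation.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05, maximize_sharpness=False):
    """One SAM-style update: evaluate the gradient at the adversarially
    perturbed point w + rho * g / ||g||, then step the original weights."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # inner ascent to a locally sharp point
    g_pert = grad_fn(w + eps)                    # gradient at the perturbed weights
    # Retain branch descends (SAM minimization); a hypothetical forget
    # branch ascends the same direction to maximize sharpness instead.
    return w + lr * g_pert if maximize_sharpness else w - lr * g_pert

# Toy quadratic loss 0.5 * w^T A w with gradient A w (an assumption for illustration).
A = np.diag([2.0, 1.0])
loss = lambda w: 0.5 * w @ A @ w
grad = lambda w: A @ w

w = np.array([1.0, 1.0])
w_retain = sam_step(w, grad)                           # loss decreases on retain data
w_forget = sam_step(w, grad, maximize_sharpness=True)  # loss increases on forget data
```

On this toy objective the retain-branch step lowers the loss while the sign-flipped forget-branch step raises it, mirroring the minimize/maximize split that Sharp MinMax assigns to the two model halves.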