Easy to Learn, Yet Hard to Forget: Towards Robust Unlearning Under Bias

📅 2026-02-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of machine unlearning in the presence of dataset bias, where models tend to learn spurious correlations, rendering conventional unlearning methods ineffective, particularly for "shortcut" biased samples that are easy to learn yet hard to forget. To tackle this issue, the authors propose the CUPID framework, which leverages differences in the sharpness of sample loss landscapes to partition the forgetting set into causal and bias subsets. CUPID decouples model parameters into distinct causal and bias pathways and applies targeted gradient updates accordingly. Notably, this is the first approach to utilize loss sharpness for both sample partitioning and parameter disentanglement, effectively overcoming the shortcut forgetting problem. Extensive experiments on Waterbirds, BAR, and Biased NICO++ demonstrate that CUPID substantially outperforms existing unlearning methods, achieving state-of-the-art performance.
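The summary's first step, splitting the forget set by loss-landscape sharpness, can be sketched in a toy form. The paper's exact sharpness measure is not reproduced here; this sketch assumes sharpness is approximated as the average loss increase under small random parameter perturbations, with the model, samples, and median-split threshold all chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(w, x, y):
    # Binary cross-entropy for a single logistic-regression sample.
    p = 1.0 / (1.0 + np.exp(-x @ w))
    return -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def sharpness(w, x, y, radius=0.05, n_probes=20):
    # Crude flatness proxy: mean loss increase when the parameters are
    # perturbed by random directions of fixed norm `radius`.
    base = loss(w, x, y)
    deltas = []
    for _ in range(n_probes):
        eps = rng.normal(size=w.shape)
        eps *= radius / np.linalg.norm(eps)
        deltas.append(loss(w + eps, x, y) - base)
    return float(np.mean(deltas))

# Toy trained weights and a two-sample "forget set" (illustrative only).
w = np.array([2.0, -1.0])
forget_set = [(np.array([0.5, 0.2]), 1),
              (np.array([-1.5, 2.0]), 1)]

scores = [sharpness(w, x, y) for x, y in forget_set]
threshold = float(np.median(scores))
# Flatter (lower-sharpness) samples approximate the bias-aligned subset,
# sharper ones the causal subset, mirroring the partitioning idea.
bias_subset = [i for i, s in enumerate(scores) if s < threshold]
causal_subset = [i for i, s in enumerate(scores) if s >= threshold]
```

The median split is only a placeholder for whatever thresholding rule the method actually uses; the point is that a per-sample scalar sharpness score suffices to partition the forget set.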

📝 Abstract
Machine unlearning, which enables a model to forget specific data, is crucial for ensuring data privacy and model reliability. However, its effectiveness can be severely undermined in real-world scenarios where models learn unintended biases from spurious correlations within the data. This paper investigates the unique challenges of unlearning from such biased models. We identify a novel phenomenon we term "shortcut unlearning," where models exhibit an "easy to learn, yet hard to forget" tendency. Specifically, models struggle to forget easily-learned, bias-aligned samples; instead of forgetting the class attribute, they unlearn the bias attribute, which can paradoxically improve accuracy on the class intended to be forgotten. To address this, we propose CUPID, a new unlearning framework inspired by the observation that samples with different biases exhibit distinct loss landscape sharpness. Our method first partitions the forget set into causal- and bias-approximated subsets based on sample sharpness, then disentangles model parameters into causal and bias pathways, and finally performs a targeted update by routing refined causal and bias gradients to their respective pathways. Extensive experiments on biased datasets including Waterbirds, BAR, and Biased NICO++ demonstrate that our method achieves state-of-the-art forgetting performance and effectively mitigates the shortcut unlearning problem.
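The abstract's final step, routing causal and bias gradients to disentangled parameter pathways, can also be illustrated in miniature. The paper's actual criterion for disentangling parameters is not shown here; this sketch simply assumes two fixed index sets standing in for the pathways, and a gradient-ascent forgetting step confined to each:

```python
import numpy as np

# Hypothetical pathway split: which parameters count as "causal" vs "bias"
# is the method's own disentanglement step; here the index sets are fixed
# by hand purely to show the routing mechanics.
causal_idx = np.array([0, 1])
bias_idx = np.array([2, 3])

def routed_update(w, grad_causal, grad_bias, lr=0.1):
    # Apply the causal-subset gradient only to the causal pathway and the
    # bias-subset gradient only to the bias pathway (ascent on the forget
    # loss, as a stand-in for a targeted forgetting update).
    w = w.copy()
    w[causal_idx] += lr * grad_causal[causal_idx]
    w[bias_idx] += lr * grad_bias[bias_idx]
    return w

w = np.zeros(4)
g_causal = np.array([1.0, 1.0, 1.0, 1.0])  # illustrative gradients
g_bias = np.array([2.0, 2.0, 2.0, 2.0])
w_new = routed_update(w, g_causal, g_bias)
```

Each pathway moves only under its own subset's gradient, so forgetting pressure on bias-aligned samples never touches the causal parameters, which is the intuition the routing step formalizes.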
Problem

Research questions and friction points this paper is trying to address.

machine unlearning
bias
spurious correlations
shortcut unlearning
data privacy
Innovation

Methods, ideas, or system contributions that make the work stand out.

machine unlearning
bias mitigation
shortcut unlearning
loss landscape sharpness
causal disentanglement
JuneHyoung Kwon
Chung-Ang University Ph.D. student
Weakly supervised learning
MiHyeon Kim
KT Corporation
Eunju Lee
Chung-Ang University
Yoonji Lee
Department of Artificial Intelligence, Chung-Ang University
Seunghoon Lee
Graduate School of Advanced Imaging Sciences, Multimedia and Film, Chung-Ang University
YoungBin Kim
Chung-Ang University
Machine Learning