When unlearning is free: leveraging low influence points to reduce computational costs

📅 2025-12-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Conventional machine unlearning approaches uniformly process all samples slated for removal, incurring prohibitively high computational overhead. Method: This paper proposes a lightweight unlearning framework based on sample influence screening. It employs influence functions to quantify each training sample's contribution to model outputs, systematically identifying and excluding low-influence samples prior to unlearning, thereby pruning the target dataset. Contribution/Results: We provide the first empirical validation that low-influence samples can be safely ignored without degrading unlearning efficacy or downstream model performance. This shifts the paradigm away from uniform treatment of all forget-set samples, achieving up to a 50% reduction in computational cost across both language and vision tasks while preserving unlearning accuracy and model utility.
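The screening step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the influence scores are assumed to be precomputed (in the paper they come from influence functions), and `keep_fraction` is a hypothetical parameter standing in for whatever threshold the method uses.

```python
def prune_forget_set(forget_set, influence_scores, keep_fraction=0.5):
    """Keep only the highest-influence samples from the forget set.

    influence_scores: one score per sample, assumed precomputed
    (e.g. via influence functions, as in the paper).
    keep_fraction: hypothetical knob; pruning ~half the forget set
    mirrors the reported ~50% computational savings.
    """
    n_keep = max(1, int(len(forget_set) * keep_fraction))
    # Rank sample indices by influence, highest first.
    ranked = sorted(range(len(forget_set)),
                    key=lambda i: influence_scores[i], reverse=True)
    # Keep the top n_keep samples, preserving original order.
    top = sorted(ranked[:n_keep])
    return [forget_set[i] for i in top]

# Example: six forget samples, half with negligible influence.
samples = ["s0", "s1", "s2", "s3", "s4", "s5"]
scores = [0.01, 0.9, 0.02, 0.8, 0.03, 0.7]
pruned = prune_forget_set(samples, scores, keep_fraction=0.5)
# pruned == ["s1", "s3", "s5"]; only these are passed to the
# (unspecified) downstream unlearning routine.
```

The low-influence samples are simply dropped from the forget set before unlearning, which is where the computational savings come from.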

📝 Abstract
As concerns around data privacy in machine learning grow, the ability to unlearn, or remove, specific data points from trained models becomes increasingly important. While state-of-the-art unlearning methods have emerged in response, they typically treat all points in the forget set equally. In this work, we challenge this approach by asking whether points that have a negligible impact on the model's learning need to be removed at all. Through a comparative analysis of influence functions across language and vision tasks, we identify subsets of training data with negligible impact on model outputs. Leveraging this insight, we propose an efficient unlearning framework that reduces the size of datasets before unlearning, leading to significant computational savings (up to approximately 50 percent) on real-world empirical examples.
Problem

Research questions and friction points this paper is trying to address.

Reducing computational costs in machine unlearning by identifying low-impact data points.
Proposing an efficient unlearning framework that minimizes dataset size before removal.
Challenging equal treatment of all data points in existing unlearning methods.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Identifies low influence data points using influence functions
Reduces dataset size before unlearning to save computation
Achieves up to 50% computational savings in real-world examples