🤖 AI Summary
Existing approximate unlearning methods for large language models (LLMs), such as Fisher, GA, and LoReUn, are often unstable, particularly when the forget set is small or imbalanced. To address this challenge, we propose RapidUn, an influence-driven, parameter-efficient unlearning framework. RapidUn uses a fast estimation module to compute per-sample influence scores, then maps those scores into adaptive update weights that guide selective parameter updates. Evaluated on Mistral-7B and Llama-3-8B across Dolly-15k and Alpaca-57k, RapidUn is up to 100× more efficient than full retraining and consistently outperforms Fisher, GA, and LoReUn on both in-distribution and out-of-distribution forgetting tasks, while aiming to retain the model's general knowledge.
📝 Abstract
Removing specific data influence from large language models (LLMs) remains challenging, as retraining is costly and existing approximate unlearning methods are often unstable. The challenge is exacerbated when the forget set is small or imbalanced. We introduce RapidUn, an influence-driven and parameter-efficient unlearning framework. It first estimates per-sample influence through a fast estimation module, then maps these scores into adaptive update weights that guide selective parameter updates -- forgetting harmful behavior while retaining general knowledge. On Mistral-7B and Llama-3-8B across Dolly-15k and Alpaca-57k, RapidUn achieves up to 100 times higher efficiency than full retraining and consistently outperforms Fisher, GA, and LoReUn on both in-distribution and out-of-distribution forgetting. These results establish influence-guided parameter reweighting as a scalable and interpretable paradigm for LLM unlearning.
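The two-stage idea in the abstract (score each sample's influence, then turn the scores into adaptive update weights) can be illustrated on a toy logistic-regression model. This is a minimal sketch under stated assumptions, not the RapidUn implementation: the abstract does not specify the influence estimator or the score-to-weight mapping, so the gradient-alignment proxy, the softmax mapping, the `temperature` and `lr` parameters, and every function name below are hypothetical choices made only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sample_grad(w, x, y):
    # Gradient of the logistic loss for one (x, y) pair with respect to w.
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * xi for xi in x]


def weighted_grad(w, samples, weights):
    # Weighted sum of per-sample gradients.
    total = [0.0] * len(w)
    for (x, y), a in zip(samples, weights):
        for j, g in enumerate(sample_grad(w, x, y)):
            total[j] += a * g
    return total


def influence_scores(w, forget):
    # ASSUMPTION: a first-order proxy, not the paper's estimator. Each score is
    # the dot product of a sample's gradient with the mean forget-set gradient.
    uniform = [1.0 / len(forget)] * len(forget)
    ref = weighted_grad(w, forget, uniform)
    return [
        sum(g * r for g, r in zip(sample_grad(w, x, y), ref))
        for x, y in forget
    ]


def adaptive_weights(scores, temperature=1.0):
    # ASSUMPTION: softmax mapping from scores to weights that sum to 1.
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def unlearn_step(w, forget, retain, lr=0.1, temperature=1.0):
    # Weighted gradient ascent on the forget set, plain descent on the retain set.
    a = adaptive_weights(influence_scores(w, forget), temperature)
    g_forget = weighted_grad(w, forget, a)
    g_retain = weighted_grad(w, retain, [1.0 / len(retain)] * len(retain))
    return [wi + lr * gf - lr * gr for wi, gf, gr in zip(w, g_forget, g_retain)]


# Toy data: (features, label) pairs.
w = [0.5, -0.25]
forget = [([1.0, 2.0], 1), ([0.5, -1.0], 0)]
retain = [([2.0, 0.0], 1), ([-1.0, 1.0], 0)]

weights = adaptive_weights(influence_scores(w, forget))
w_new = unlearn_step(w, forget, retain)
```

In this sketch, forget samples with higher scores receive larger weights and therefore contribute more to the ascent direction, while the retain-set term pulls the parameters back toward behavior that should be preserved. For an LLM, the same pattern would presumably apply to a parameter-efficient subset of the weights rather than to a full weight vector, but that detail is not stated in the abstract.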