BLUR: A Bi-Level Optimization Approach for LLM Unlearning

📅 2025-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing weighted-loss approaches to compliance-driven knowledge unlearning in large language models (LLMs) struggle to ensure thorough forgetting while preserving model utility. Method: We formulate unlearning as a bilevel optimization problem with an explicit priority: the lower level strictly minimizes the forget loss to erase target knowledge or capabilities, while the upper level optimizes retained performance subject to the forgetting constraint. Leveraging implicit function gradients and efficient Hessian-vector product approximations, we propose the first theoretically grounded, architecture-aware, end-to-end bilevel unlearning framework. Contribution/Results: The method achieves state-of-the-art performance across diverse benchmarks spanning multiple tasks, models, and evaluation metrics, with significant improvements in both forgetting completeness and utility retention, provable convergence guarantees, and a better balance between the two competing objectives.
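To make the mechanism in the summary concrete, here is a minimal toy sketch (our construction, not BLUR's implementation) of differentiating an upper-level "retain"-style loss through a lower-level "forget"-style solve using the implicit function theorem. The quadratic losses and the names `A`, `b`, `lower_solution`, and `hypergradient` are all illustrative assumptions; in the real setting the Hessian-inverse-vector product would be approximated with Hessian-vector products rather than available in closed form.

```python
import numpy as np

# Toy bilevel problem (illustrative only):
#   lower level: y*(x) = argmin_y 0.5 * ||y - A x||^2   ("forget"-style loss)
#   upper level: min_x F(x) = 0.5 * ||y*(x) - b||^2     ("retain"-style loss)
# Implicit differentiation: dF/dx = (dy*/dx)^T grad_y F, with dy*/dx
# obtained from the implicit function theorem (here dy*/dx = A).

rng = np.random.default_rng(0)
A = np.eye(3) + 0.1 * rng.normal(size=(3, 3))  # well-conditioned for the demo
b = rng.normal(size=3)

def lower_solution(x):
    # Closed-form argmin of the lower-level (forget) objective.
    return A @ x

def hypergradient(x):
    y = lower_solution(x)
    grad_y_F = y - b  # gradient of the upper-level (retain) loss in y
    # In general this step requires a Hessian-inverse-vector product,
    # typically approximated with Hessian-vector products (e.g. via
    # conjugate gradient); here the lower-level Hessian is the identity,
    # so the solve is a no-op.
    v = grad_y_F
    return A.T @ v  # chain rule back through the solution map y*(x) = A x

x = np.zeros(3)
for _ in range(300):
    x -= 0.5 * hypergradient(x)  # descend the upper loss through the lower solve

# The retain loss is driven to zero while every iterate stays on the
# lower-level solution map by construction.
print(float(np.linalg.norm(lower_solution(x) - b)) < 1e-6)
```

The point of the sketch is the priority structure: the lower-level solve is never compromised, and the upper-level objective only steers among lower-level solutions.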

📝 Abstract
Enabling large language models (LLMs) to unlearn knowledge and capabilities acquired during training has proven vital for ensuring compliance with data regulations and promoting ethical practices in generative AI. Although there is growing interest in developing various unlearning algorithms, it remains unclear how to best formulate the unlearning problem. The most popular formulation uses a weighted sum of forget and retain loss, but it often leads to performance degradation due to the inherent trade-off between forget and retain losses. In this work, we argue that it is important to model the hierarchical structure of the unlearning problem, where the forget problem (which unlearns certain knowledge and/or capabilities) takes priority over the retain problem (which preserves model utility). This hierarchical structure naturally leads to a bi-level optimization formulation where the lower-level objective focuses on minimizing the forget loss, while the upper-level objective aims to maintain the model's utility. Based on this new formulation, we propose a novel algorithm, termed Bi-Level UnleaRning (BLUR), which not only possesses strong theoretical guarantees but more importantly, delivers superior performance. In particular, our extensive experiments demonstrate that BLUR consistently outperforms all the state-of-the-art algorithms across various unlearning tasks, models, and metrics. Codes are available at https://github.com/OptimAI-Lab/BLURLLMUnlearning.
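In our notation (not necessarily the paper's), the hierarchical structure described in the abstract can be written as a simple bilevel program:

```latex
\min_{\theta \in \mathcal{S}} \; \ell_{\mathrm{retain}}(\theta)
\quad \text{s.t.} \quad
\mathcal{S} \;=\; \argmin_{\theta'} \; \ell_{\mathrm{forget}}(\theta')
```

The lower level first pins down the set $\mathcal{S}$ of models that fully minimize the forget loss, and the upper level selects among them the one with the best utility; this is in contrast to the weighted-sum formulation $\min_\theta \ell_{\mathrm{retain}}(\theta) + \lambda\,\ell_{\mathrm{forget}}(\theta)$, which trades the two losses off against each other.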
Problem

Research questions and friction points this paper is trying to address.

Optimizing LLM unlearning with bi-level hierarchy
Balancing forget and retain loss trade-offs
Enhancing model compliance and ethical AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bi-level optimization for LLM unlearning
Prioritizes forget loss over retain loss
Theoretical guarantees and superior performance