LUME: LLM Unlearning with Multitask Evaluations

📅 2025-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of removing copyrighted, sensitive, and private content from large language models (LLMs) without full retraining. It introduces LUME, a multi-task unlearning benchmark spanning three tasks: unlearning synthetically generated creative short novels, unlearning synthetic biographies containing sensitive information, and unlearning a collection of real public-figure biographies. Two fine-tuned target models of 1B and 7B parameters are released alongside the benchmark. Several recently proposed unlearning algorithms are evaluated with carefully crafted metrics, and the results characterize their behavior and limitations, providing a reproducible basis for assessing LLM unlearning.

📝 Abstract
Unlearning aims to remove copyrighted, sensitive, or private content from large language models (LLMs) without a full retraining. In this work, we develop a multi-task unlearning benchmark (LUME) which features three tasks: (1) unlearn synthetically generated creative short novels, (2) unlearn synthetic biographies with sensitive information, and (3) unlearn a collection of public biographies. We further release two fine-tuned LLMs of 1B and 7B parameter sizes as the target models. We conduct detailed evaluations of several recently proposed unlearning algorithms and present results on carefully crafted metrics to understand their behavior and limitations.
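The abstract mentions "carefully crafted metrics" for judging whether unlearning succeeded. As a minimal illustration of one common style of check (not the official LUME evaluation code), the sketch below scores a model's completion against memorized forget-set text with a ROUGE-L-style F1 based on the longest common subsequence; after successful unlearning, completions on forget-set prompts should score low against the ground truth while retain-set scores stay high. The example strings are hypothetical.

```python
def lcs_len(a: list[str], b: list[str]) -> int:
    # Dynamic-programming longest common subsequence over token lists.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]


def rouge_l_f1(reference: str, candidate: str) -> float:
    # ROUGE-L F1: harmonic mean of LCS-based precision and recall
    # over whitespace tokens. 1.0 = verbatim regurgitation, 0.0 = no overlap.
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical forget-set ground truth vs. an unlearned model's completion:
forget_truth = "the hidden letter was buried beneath the old oak tree"
unlearned_output = "I do not have information about that story"
print(round(rouge_l_f1(forget_truth, unlearned_output), 3))  # → 0.0
```

A real harness would average such scores over all forget-set prompts and compare against retain-set utility; this sketch only shows the per-example scoring step.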
Problem

Research questions and friction points this paper is trying to address.

Remove sensitive content from LLMs
Develop multi-task unlearning benchmark
Evaluate unlearning algorithms' effectiveness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-task unlearning benchmark
Fine-tuned LLMs release
Detailed unlearning algorithms evaluation