LUME: LLM Unlearning with Multitask Evaluations

📅 2025-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of removing copyrighted, sensitive, and private content from large language models (LLMs) without full retraining. It introduces LUME, a multi-task unlearning benchmark spanning three tasks: unlearning synthetically generated creative short novels, unlearning synthetic biographies containing sensitive information, and unlearning a collection of real public-figure biographies. Two fine-tuned target models of 1B and 7B parameters are released alongside the benchmark. Several recently proposed unlearning algorithms are evaluated with carefully crafted metrics, and the results characterize their behavior and limitations, providing a reproducible basis for assessing LLM unlearning.

📝 Abstract
Unlearning aims to remove copyrighted, sensitive, or private content from large language models (LLMs) without a full retraining. In this work, we develop a multi-task unlearning benchmark (LUME) which features three tasks: (1) unlearn synthetically generated creative short novels, (2) unlearn synthetic biographies with sensitive information, and (3) unlearn a collection of public biographies. We further release two fine-tuned LLMs of 1B and 7B parameter sizes as the target models. We conduct detailed evaluations of several recently proposed unlearning algorithms and present results on carefully crafted metrics to understand their behavior and limitations.
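The abstract mentions "carefully crafted metrics" for judging whether unlearning succeeded. As a minimal illustration of one common style of check (not the official LUME evaluation code), the sketch below scores a model's completion against memorized forget-set text with a ROUGE-L-style F1 based on the longest common subsequence; after successful unlearning, completions on forget-set prompts should score low against the ground truth while retain-set scores stay high. The example strings are hypothetical.

```python
def lcs_len(a: list[str], b: list[str]) -> int:
    # Dynamic-programming longest common subsequence over token lists.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]


def rouge_l_f1(reference: str, candidate: str) -> float:
    # ROUGE-L F1: harmonic mean of LCS-based precision and recall
    # over whitespace tokens. 1.0 = verbatim regurgitation, 0.0 = no overlap.
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical forget-set ground truth vs. an unlearned model's completion:
forget_truth = "the hidden letter was buried beneath the old oak tree"
unlearned_output = "I do not have information about that story"
print(round(rouge_l_f1(forget_truth, unlearned_output), 3))  # → 0.0
```

A real harness would average such scores over all forget-set prompts and compare against retain-set utility; this sketch only shows the per-example scoring step.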
Problem

Research questions and friction points this paper is trying to address.

Remove sensitive content from LLMs
Develop multi-task unlearning benchmark
Evaluate unlearning algorithms' effectiveness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-task unlearning benchmark
Fine-tuned LLMs release
Detailed unlearning algorithms evaluation