Opt-Out: Investigating Entity-Level Unlearning for Large Language Models via Optimal Transport

📅 2024-06-18
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses user-level privacy protection in large language models (LLMs) by formalizing the entity-level machine unlearning problem: precisely erasing all knowledge associated with a specific entity (e.g., a user) while preserving the model’s general capabilities. To this end, we first rigorously define the entity-level unlearning task and introduce ELUDe—a dedicated benchmark for evaluation. We then propose a novel optimal transport–driven unlearning mechanism grounded in the Wasserstein distance, enabling provably sound, fine-grained, parameter-space interventions without full retraining. Experiments on ELUDe demonstrate that our method substantially outperforms existing approaches, achieving high unlearning completeness and minimal degradation on downstream tasks. The framework thus provides an efficient, verifiable, and compliance-ready solution for regulatory data deletion requirements.

📝 Abstract
Instruction-following large language models (LLMs), such as ChatGPT, have become widely popular among everyday users. However, these models inadvertently disclose private, sensitive information to their users, underscoring the need for machine unlearning techniques to remove selective information from the models. While prior work has focused on forgetting small, random subsets of training data at the instance level, we argue that real-world scenarios often require the removal of all of a user's data, which may demand a more careful maneuver. In this study, we explore entity-level unlearning, which aims to erase all knowledge related to a target entity while preserving the remaining model capabilities. To address this, we introduce Opt-Out, an optimal transport-based unlearning method that utilizes the Wasserstein distance from the model's initial parameters to achieve more effective and fine-grained unlearning. We also present the first Entity-Level Unlearning Dataset (ELUDe) designed to evaluate entity-level unlearning. Our empirical results demonstrate that Opt-Out surpasses existing methods, establishing a new standard for secure and adaptable LLMs that can accommodate user data removal requests without the need for full retraining.
Problem

Research questions and friction points this paper is trying to address.

Remove private user data from large language models
Achieve entity-level unlearning without full retraining
Preserve model capabilities while erasing target entities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal transport-based unlearning method
Entity-Level Unlearning Dataset (ELUDe)
Wasserstein distance for fine-grained unlearning
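The page does not spell out how the Wasserstein term is computed, but the core idea stated above — regularizing unlearning updates by the optimal transport distance between the current and initial parameters — can be sketched in a minimal form. For one-dimensional empirical distributions of equal size (e.g., the flattened weights of a single layer before and after unlearning), the 2-Wasserstein distance reduces to the root-mean-square difference of the sorted values. The function name `wasserstein_penalty` and the per-layer framing are illustrative assumptions, not the paper's actual implementation:

```python
import math

def wasserstein_penalty(theta, theta0):
    """Illustrative 1-D 2-Wasserstein distance between the empirical
    distributions of two equal-length parameter vectors.

    For equal-size samples, W2 has a closed form: sort both vectors and
    take the root-mean-square of the pairwise differences. This is a
    sketch of the kind of parameter-space regularizer described above,
    not the paper's exact objective.
    """
    a, b = sorted(theta), sorted(theta0)  # optimal 1-D coupling matches sorted order
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Identical multisets of weights have zero transport cost,
# regardless of element order.
print(wasserstein_penalty([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))  # 0.0

# A uniform unit shift of every weight costs exactly 1.
print(wasserstein_penalty([0.0, 0.0], [1.0, 1.0]))  # 1.0
```

In a full unlearning loop, a term like this would presumably be added per layer to the forgetting loss (e.g., `loss = forget_loss + lam * wasserstein_penalty(w, w0)`), penalizing drift from the pre-unlearning parameters so that knowledge of the target entity is removed with minimal collateral change to the model.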
🔎 Similar Papers
2024-06-22 · International Conference on Computational Linguistics · Citations: 4