Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice

πŸ“… 2024-12-09
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 29
✨ Influential: 1
πŸ“„ PDF
πŸ€– AI Summary
Machine unlearning techniques fail to meet legal and ethical requirements for privacy erasure, copyright compliance, and content suppression in generative AI, reflecting a fundamental misalignment between technical capabilities and policy objectives.

Method: We propose the first policy-oriented conceptual framework for machine unlearning, rigorously distinguishing parameter-level information removal from output-level behavioral suppression, and exposing its inherent limitations as a general-purpose compliance tool. Integrating technical feasibility analysis, legal theory, and AI governance practice, we conduct interdisciplinary conceptual modeling and root-cause analysis of key challenges.

Contribution/Results: The study clarifies the precise applicability boundaries of machine unlearning, establishes a more rigorous, cross-disciplinary technical discourse between machine learning, law, and policy, and advances pragmatic collaboration pathways for AI regulation. This framework enables precise alignment of technical interventions with normative goals, critical for accountable, rights-respecting AI deployment.

πŸ“ Abstract
We articulate fundamental mismatches between technical methods for machine unlearning in Generative AI, and documented aspirations for broader impact that these methods could have for law and policy. These aspirations are both numerous and varied, motivated by issues that pertain to privacy, copyright, safety, and more. For example, unlearning is often invoked as a solution for removing the effects of targeted information from a generative-AI model's parameters, e.g., a particular individual's personal data or in-copyright expression of Spiderman that was included in the model's training data. Unlearning is also proposed as a way to prevent a model from generating targeted types of information in its outputs, e.g., generations that closely resemble a particular individual's data or reflect the concept of "Spiderman." Both of these goals--the targeted removal of information from a model and the targeted suppression of information from a model's outputs--present various technical and substantive challenges. We provide a framework for thinking rigorously about these challenges, which enables us to be clear about why unlearning is not a general-purpose solution for circumscribing generative-AI model behavior in service of broader positive impact. We aim for conceptual clarity and to encourage more thoughtful communication among machine learning (ML), law, and policy experts who seek to develop and apply technical methods for compliance with policy objectives.
Problem

Research questions and friction points this paper is trying to address.

Machine unlearning cannot reliably remove the effects of targeted information from a model's parameters
Nor can it reliably suppress targeted types of information in a generative-AI model's outputs
There are fundamental mismatches between policy aspirations for unlearning and technically feasible implementations
Innovation

Methods, ideas, or system contributions that make the work stand out.

A policy-oriented conceptual framework that rigorously distinguishes parameter-level information removal from output-level suppression
Analysis of why unlearning is not a general-purpose compliance tool for generative-AI behavior
Conceptual clarity that helps ML, law, and policy experts align technical interventions with policy objectives
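The core distinction the framework draws, removing information from a model's parameters versus suppressing it in the model's outputs, can be illustrated with a deliberately simplified sketch. This toy example is not from the paper; the "model," its data, and all function names are hypothetical, standing in for real training, filtering, and retraining pipelines:

```python
# Toy illustration of the paper's distinction between output-level
# suppression and parameter-level removal. All names are hypothetical.

def train(corpus):
    # Parameter-level state: this toy "model" simply memorizes its training data.
    return {"memory": list(corpus)}

def generate(model, prompt):
    # Return the first memorized string containing the prompt, if any.
    for s in model["memory"]:
        if prompt in s:
            return s
    return "<no match>"

corpus = ["Alice's phone is 555-0100", "the sky is blue"]
model = train(corpus)

# 1) Output-level suppression: a filter hides the targeted string, but the
#    information still sits in the model's parameters (model["memory"]).
def suppressed_generate(model, prompt, blocklist):
    out = generate(model, prompt)
    return "<blocked>" if any(b in out for b in blocklist) else out

print(suppressed_generate(model, "Alice", ["555-0100"]))  # <blocked>
print("555-0100" in str(model["memory"]))                 # True: info remains

# 2) Parameter-level removal: retrain without the targeted data, so the
#    information is absent from the model itself, not merely filtered.
retrained = train([s for s in corpus if "Alice" not in s])
print(generate(retrained, "Alice"))                       # <no match>
print("555-0100" in str(retrained["memory"]))             # False
```

The sketch makes the paper's point concrete: the filtered model still contains the targeted information (an extraction risk a real filter cannot eliminate), while actual removal here requires retraining, which is exactly what practical unlearning methods try, and often fail, to approximate cheaply.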
πŸ”Ž Similar Papers
No similar papers found.
👥 Authors

A. F. Cooper (The GenLaw Center, Microsoft Research, Stanford University)
Christopher A. Choquette-Choo (OpenAI): machine learning, trustworthy machine learning, data privacy, adversarial machine learning, security
Miranda Bogen (Center for Democracy & Technology, Princeton CITP)
Matthew Jagielski (Anthropic): adversarial machine learning, differential privacy, security
Katja Filippova (Google DeepMind): Natural Language Processing, Computational Linguistics
Ken Ziyu Liu (Stanford University): Machine Learning
Alexandra Chouldechova (Researcher @ MSR NYC FATE)
Jamie Hayes (Google DeepMind)
Yangsibo Huang (Google DeepMind): Machine Learning
Niloofar Mireshghallah (University of Washington)
Ilia Shumailov (AI Sequrity Company): Machine Learning, Computer Security, Adversarial Machine Learning, AI Security
Eleni Triantafillou (Google DeepMind): Machine Learning, Few-shot Learning, Meta-Learning
Peter Kairouz (Research Scientist, Google): Differential Privacy, Federated Learning, Artificial Intelligence, Machine Learning, Information Theory
N. Mitchell (Google Research)
Percy Liang (Associate Professor of Computer Science, Stanford University): machine learning, natural language processing
Daniel E. Ho (Stanford University): Regulatory policy, artificial intelligence, administrative law, antidiscrimination
Yejin Choi (Stanford University / NVIDIA): Natural Language Processing, Deep Learning, Artificial Intelligence, Commonsense Reasoning
Sanmi Koyejo (Assistant Professor, Stanford University): Machine Learning, Healthcare AI, Neuroinformatics
Fernando Delgado (Lighthouse)
James Grimmelmann (The GenLaw Center, Cornell Tech, Cornell Law School)
Vitaly Shmatikov (Cornell Tech)
Christopher De Sa (Associate Professor of Computer Science, Cornell University): machine learning systems
Solon Barocas (Microsoft Research; Cornell University)
A. Cyphert (West Virginia University, College of Law)
Mark A. Lemley (Stanford Law School)
danah boyd (Partner Researcher, Microsoft Research; Distinguished Visiting Professor, Georgetown University): Algorithmic Accountability, Social Media, Internet Studies, Big Data
Jennifer Wortman Vaughan (Senior Principal Research Manager, Microsoft Research, New York City): AI Transparency, AI Fairness, Responsible AI, Machine Learning, Algorithmic Economics
M. Brundage
David Bau (Assistant Professor at Northeastern University): Machine Learning, Computer Vision, NLP, Software Engineering, HCI
Seth Neel (Google): computer science, machine learning, privacy, fairness
Abigail Z. Jacobs (University of Michigan)
Andreas Terzis (Google DeepMind): Computer Networks, Machine Learning, Privacy, Security
Hanna Wallach (VP & Distinguished Scientist, Microsoft Research): AI Evaluation & Measurement, Responsible AI, Computational Social Science, ML, NLP
Nicolas Papernot (University of Toronto and Vector Institute): Computer Security, Deep Learning, Data Privacy
Katherine Lee (Researcher, OpenAI): natural language processing, machine learning, privacy, ml security, tech policy