LLM Unlearning Without an Expert Curated Dataset

📅 2025-08-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing post-hoc unlearning methods for large language models (LLMs) rely on manually curated forget sets, incurring high annotation costs and scaling poorly. Method: a fully automated unlearning framework that, given only a domain name, uses multi-step structured prompting to guide LLMs in generating diverse, high-quality, textbook-style forget data, and combines data-diversity optimization with progressive unlearning training for end-to-end automation. Contribution/Results: the method eliminates the need for expert annotation. On benchmarks spanning biosecurity, cybersecurity, and copyrighted text, it achieves unlearning performance comparable to expert-curated forget sets and significantly outperforms baseline synthetic alternatives, demonstrating effectiveness, cross-domain generalizability, and practical utility for scalable, cost-efficient model editing.
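The multi-step pipeline the summary describes (domain name in, textbook-style forget data out) can be sketched as below. The prompt templates and the `generate` stub are illustrative assumptions, not the paper's actual prompts or implementation; a real pipeline would replace `generate` with an LLM API call.

```python
# Hypothetical sketch: domain name -> outline -> textbook-style passages.
# `generate` is a stand-in for an LLM call; prompts are assumed, not the paper's.

def generate(prompt: str) -> str:
    """Stub for a model call; a real pipeline would query an LLM here."""
    return f"[model output for: {prompt[:40]}...]"

def outline_prompt(domain: str) -> str:
    # Step 1: ask the model to structure the domain like a textbook.
    return (f"You are planning a textbook on {domain}. "
            "List five chapter titles covering its core concepts.")

def chapter_prompt(domain: str, chapter: str) -> str:
    # Step 2: expand each chapter title into a passage.
    return (f"Write a concise textbook-style passage for the chapter "
            f"'{chapter}' in a textbook on {domain}.")

def build_forget_set(domain: str, chapters: list[str]) -> list[str]:
    """Run the two prompting steps and collect the generated passages."""
    _outline = generate(outline_prompt(domain))          # step 1: outline
    return [generate(chapter_prompt(domain, c))          # step 2: passages
            for c in chapters]

passages = build_forget_set("biosecurity", ["Pathogen basics", "Lab safety"])
print(len(passages))  # 2
```

Splitting generation into an outline step and an expansion step is what gives the pipeline its structure: each chapter prompt is conditioned on a different title, so the passages cover different parts of the domain.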

📝 Abstract
Modern large language models often encode sensitive, harmful, or copyrighted knowledge, raising the need for post-hoc unlearning: the ability to remove specific domains of knowledge from a model without full retraining. A major bottleneck in current unlearning pipelines is constructing effective forget sets, datasets that approximate the target domain and guide the model to forget it. In this work, we introduce a scalable, automated approach to generate high-quality forget sets using language models themselves. Our method synthesizes textbook-style data through a structured prompting pipeline, requiring only a domain name as input. Through experiments on unlearning biosecurity, cybersecurity, and Harry Potter novels, we show that our synthetic datasets consistently outperform the baseline synthetic alternatives and are comparable to the expert-curated ones. Additionally, ablation studies reveal that the multi-step generation pipeline significantly boosts data diversity, which in turn improves unlearning utility. Overall, our findings suggest that synthetic datasets offer a promising path toward practical, scalable unlearning for a wide range of emerging domains without the need for manual intervention. We release our code and dataset at https://github.com/xyzhu123/Synthetic_Textbook.
Problem

Research questions and friction points this paper is trying to address.

Removing sensitive knowledge from LLMs without retraining
Automating forget set generation for unlearning pipelines
Enhancing unlearning with synthetic textbook-style data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated synthetic dataset generation for unlearning
Textbook-style data via structured prompting pipeline
Multi-step generation boosts data diversity
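The diversity claim in the last bullet is commonly quantified with a distinct-n score, the ratio of unique word n-grams to total n-grams across a corpus. A minimal sketch of that metric follows; it is a standard measure, not necessarily the one used in the paper:

```python
def distinct_n(texts: list[str], n: int = 2) -> float:
    """Ratio of unique word n-grams to total n-grams across a corpus.
    Higher values indicate more lexically diverse generations."""
    ngrams = []
    for t in texts:
        words = t.split()
        ngrams.extend(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

repetitive = ["the cell divides", "the cell divides"]
varied = ["the cell divides", "proteins fold rapidly"]
print(distinct_n(repetitive), distinct_n(varied))  # 0.5 1.0
```

Under a metric like this, a multi-step pipeline that conditions each generation on a different outline entry should score higher than single-prompt sampling, which tends to repeat phrasing.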