🤖 AI Summary
This work introduces SemEval-2025 Task 4, a shared task on unlearning sensitive content from large language models (LLMs). It covers three representative scenarios: long-form synthetic creative documents, short-form synthetic biographies containing personally identifiable information (PII) such as names, phone numbers, SSNs, and email and home addresses, and real documents sampled from the target model's training data. Its primary contribution is a standardized LLM unlearning benchmark spanning both synthetic and real-world data, with sensitive attributes at multiple granularities, enabling verifiable, scenario-aware evaluation. The task attracted over 100 submissions from more than 30 institutions worldwide, and the paper summarizes the key techniques participants used and the lessons learned for safe, controllable LLM development.
📝 Abstract
We introduce SemEval-2025 Task 4: unlearning sensitive content from Large Language Models (LLMs). The task features three subtasks for LLM unlearning spanning different use cases: (1) unlearn long-form synthetic creative documents spanning different genres; (2) unlearn short-form synthetic biographies containing personally identifiable information (PII), including fake names, phone numbers, SSNs, and email and home addresses; and (3) unlearn real documents sampled from the target model's training dataset. We received over 100 submissions from over 30 institutions, and we summarize the key techniques and lessons learned in this paper.