Forget to Know, Remember to Use: Context-Aware Unlearning for Large Language Models

📅 2025-10-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) may implicitly encode sensitive or outdated knowledge, necessitating safe model editing; however, existing unlearning methods overlook *context availability*, i.e., the requirement that models retain the ability to reason correctly when the target information is explicitly reintroduced in the prompt. This work proposes the first context-aware unlearning framework, featuring a plug-and-play target-item design and a context recovery mechanism. It achieves effective erasure of target knowledge without degrading performance on retained data, while substantially improving responsiveness to prompt-reintroduced information. Comprehensive experiments across six state-of-the-art unlearning methods show that the approach restores context utility to the level of the original pre-unlearning model (an average improvement of 32.7%), attains a forgetting success rate above 94%, and preserves semantic consistency and reasoning robustness.

📝 Abstract
Large language models may encode sensitive information or outdated knowledge that needs to be removed to ensure responsible and compliant model responses. Unlearning has emerged as an efficient alternative to full retraining, aiming to remove specific knowledge while preserving overall model utility. Existing evaluations of unlearning methods focus on (1) the extent of forgetting of the target knowledge (forget set) and (2) maintaining performance on the retain set (i.e., utility). However, these evaluations overlook an important usability aspect: users may still want the model to leverage the removed information if it is re-introduced in the prompt. In a systematic evaluation of six state-of-the-art unlearning methods, we find that they consistently impair such contextual utility. To address this, we augment unlearning objectives with a plug-in term that preserves the model's ability to use forgotten knowledge when it is present in context. Extensive experiments demonstrate that our approach restores contextual utility to near original levels while still maintaining effective forgetting and retain-set utility.
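The abstract does not spell out the plug-in term, but the idea of augmenting a standard unlearning objective with a context-utility term can be sketched as a weighted sum of three losses. The function and weight names below (`unlearning_objective`, `lam_retain`, `lam_context`) are illustrative assumptions, not the paper's actual formulation:

```python
import math

def nll(probs, target_idx):
    """Negative log-likelihood of the target token under a toy distribution."""
    return -math.log(probs[target_idx])

def unlearning_objective(forget_nll, retain_nll, context_nll,
                         lam_retain=1.0, lam_context=1.0):
    """Combined loss to minimize (illustrative sketch).

    - forget_nll:  NLL on forget-set answers; negated so that minimizing
      the objective *raises* this NLL (gradient-ascent-style forgetting).
    - retain_nll:  NLL on retain-set answers (the usual utility term).
    - context_nll: NLL on answers when the forgotten fact is explicitly
      restated in the prompt -- the plug-in term that preserves
      contextual utility.
    """
    return -forget_nll + lam_retain * retain_nll + lam_context * context_nll

# Toy numbers: a well-unlearned model has high NLL on forget-set queries
# but stays confident on retain-set and context-provided answers.
loss = unlearning_objective(forget_nll=5.0, retain_nll=0.2, context_nll=0.3)
print(round(loss, 6))  # ≈ -4.5
```

In this sketch, omitting the context term (`lam_context=0`) recovers a conventional forget-plus-retain objective; the extra term penalizes the model for failing to answer when the "forgotten" information is available in the prompt, which is the usability gap the paper identifies.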
Problem

Research questions and friction points this paper is trying to address.

Removing sensitive or outdated knowledge from LLMs
Preserving model utility while eliminating specific information
Maintaining contextual usage of forgotten knowledge when prompted
Innovation

Methods, ideas, or system contributions that make the work stand out.

Context-aware unlearning preserves contextual utility
Plug-in term maintains ability to use forgotten knowledge
Restores contextual utility while ensuring effective forgetting