🤖 AI Summary
Existing unlearning methods designed for natural language fail to support targeted unlearning of sensitive, outdated, or insecure training data in large language models (LLMs) for code, and applying them to code causes severe degradation in code generation performance.
Method: We propose PROD, the first collaborative unlearning framework tailored to source code LLMs. It integrates syntax-aware forgetting objective modeling with output distribution regulation to jointly suppress unwanted behaviors and promote desired candidates.
Contributions/Results: (1) A novel probabilistic suppression and candidate augmentation mechanism grounded in program syntax; (2) The first code-specific unlearning benchmark covering copyright, security, and deprecated API removal tasks; (3) A lightweight fine-tuning framework compatible with diverse code LLMs (e.g., CodeLlama, StarCoder). Experiments show PROD improves forgetting accuracy by 32–58%, incurs <2.1% Pass@1 degradation, and provides robustness against adversarial attacks and formal zero-information-leakage guarantees.
📄 Abstract
Large language models for software engineering (LLM4SE) have demonstrated significant success, but LLMs' potential memorization of sensitive or outdated training data introduces critical risks to legal compliance, software security, and code quality. LLM unlearning techniques, which can eliminate the influence of undesired data from LLMs in a post-training manner, present a promising solution to these concerns. While recent LLM unlearning efforts have proven effective on natural language, their applicability to source code remains underexplored. Our empirical study reveals that existing LLM unlearning approaches, when applied to source code, cause severe model utility degradation, rendering models practically unusable for code generation. In this paper, we propose PROD, a novel unlearning approach that enables LLMs to forget undesired code content while effectively preserving their code generation capabilities. PROD suppresses the probability of forget data in the LLM's output distribution while promoting candidate distributional components, enabling the model to jointly learn to forget specific content and retain its general capabilities. To facilitate this study, we establish a benchmark for code unlearning evaluation covering three critical downstream tasks: copyrighted code unlearning, insecure code unlearning, and deprecated API unlearning. Our evaluation demonstrates that PROD achieves a superior balance between forget quality and model utility compared to existing unlearning approaches across all three tasks, and that its improvements hold consistently across LLMs of different families. PROD also exhibits superior robustness against adversarial attacks without generating or exposing the data to be forgotten. These results underscore that our approach not only extends the application boundary of unlearning techniques to source code, but also holds significant implications for advancing reliable code generation.
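To make the suppress-and-promote idea concrete, below is a minimal, hypothetical PyTorch sketch of such an objective. This is not the paper's implementation: the exact form of PROD's loss is not given here, so the function name `suppress_and_promote_loss`, the `-log(1 - p)` suppression term, and the retain-set weighting `lam` are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def suppress_and_promote_loss(forget_logits: torch.Tensor,
                              forget_targets: torch.Tensor,
                              retain_logits: torch.Tensor,
                              retain_targets: torch.Tensor,
                              lam: float = 1.0) -> torch.Tensor:
    """Illustrative unlearning objective (NOT the paper's exact loss).

    forget_logits:  (T_f, vocab) next-token logits on a forget example
    forget_targets: (T_f,)       token ids to be unlearned
    retain_logits:  (T_r, vocab) next-token logits on a retain example
    retain_targets: (T_r,)       ground-truth ids to keep modeling well
    """
    # Probability mass currently assigned to each forget token.
    probs = F.softmax(forget_logits, dim=-1)
    p_forget = probs.gather(-1, forget_targets.unsqueeze(-1)).squeeze(-1)

    # Suppression and promotion in one term: minimizing -log(1 - p_forget)
    # drives p_forget toward zero, and by normalization the freed
    # probability mass is redistributed over the remaining candidates.
    forget_loss = -torch.log1p(-p_forget.clamp(max=1.0 - 1e-6)).mean()

    # Standard cross-entropy on retain data preserves general utility.
    retain_loss = F.cross_entropy(retain_logits, retain_targets)

    return forget_loss + lam * retain_loss
```

In a full fine-tuning loop, this loss would be averaged over mini-batches drawn from the forget and retain sets and backpropagated through the LLM; the assumed `lam` knob trades off the forget quality and model utility that the evaluation measures.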