🤖 AI Summary
Existing unlearning methods designed for natural language fail to support targeted unlearning of sensitive, outdated, or insecure training data in large language models (LLMs) for code, and applying them to code causes severe degradation in code generation performance.
Method: We propose PROD, the first collaborative unlearning framework tailored to source code LLMs. It integrates syntax-aware forgetting objective modeling with output distribution regulation to jointly suppress unwanted behaviors and promote desired candidates.
Contributions/Results: (1) A novel probabilistic suppression and candidate augmentation mechanism grounded in program syntax; (2) The first code-specific unlearning benchmark covering copyright, security, and deprecated API removal tasks; (3) A lightweight fine-tuning framework compatible with diverse code LLMs (e.g., CodeLlama, StarCoder). Experiments show PROD improves forgetting accuracy by 32–58%, incurs <2.1% Pass@1 degradation, and provides robustness against adversarial attacks and formal zero-information-leakage guarantees.
📄 Abstract
Large language models for software engineering (LLM4SE) have demonstrated significant success, but LLMs' potential memorization of sensitive or outdated training data introduces critical risks to legal compliance, software security, and code quality. LLM unlearning techniques, which can eliminate the influence of undesired data from LLMs in a post-training manner, present a promising solution to these concerns. While recent LLM unlearning efforts have proven effective on natural language, their applicability to source code remains underexplored. Our empirical study reveals that existing LLM unlearning approaches, when applied to source code, cause severe model utility degradation, rendering models practically unusable for code generation. In this paper, we propose PROD, a novel unlearning approach that enables LLMs to forget undesired code content while effectively preserving their code generation capabilities. PROD suppresses the probability of forget data in the LLM's output distribution while promoting candidate distributional components, enabling the model to jointly learn to forget specific content and retain its general capabilities. To facilitate this study, we establish a benchmark for code unlearning evaluation covering three critical downstream tasks: copyrighted code unlearning, insecure code unlearning, and deprecated API unlearning. Our evaluation demonstrates that PROD achieves a superior balance between forget quality and model utility compared to existing unlearning approaches across all three tasks, and that its improvements hold consistently across LLMs of different families. PROD also exhibits superior robustness against adversarial attacks without generating or exposing the data to be forgotten. These results underscore that our approach not only extends the application boundary of unlearning techniques to source code, but also holds significant implications for advancing reliable code generation.
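To make the suppress-and-promote idea concrete, below is a minimal, hypothetical PyTorch sketch of such an objective. This is not the paper's implementation: the exact form of PROD's loss is not given here, so the function name `suppress_and_promote_loss`, the `-log(1 - p)` suppression term, and the retain-set weighting `lam` are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def suppress_and_promote_loss(forget_logits: torch.Tensor,
                              forget_targets: torch.Tensor,
                              retain_logits: torch.Tensor,
                              retain_targets: torch.Tensor,
                              lam: float = 1.0) -> torch.Tensor:
    """Illustrative unlearning objective (NOT the paper's exact loss).

    forget_logits:  (T_f, vocab) next-token logits on a forget example
    forget_targets: (T_f,)       token ids to be unlearned
    retain_logits:  (T_r, vocab) next-token logits on a retain example
    retain_targets: (T_r,)       ground-truth ids to keep modeling well
    """
    # Probability mass currently assigned to each forget token.
    probs = F.softmax(forget_logits, dim=-1)
    p_forget = probs.gather(-1, forget_targets.unsqueeze(-1)).squeeze(-1)

    # Suppression and promotion in one term: minimizing -log(1 - p_forget)
    # drives p_forget toward zero, and by normalization the freed
    # probability mass is redistributed over the remaining candidates.
    forget_loss = -torch.log1p(-p_forget.clamp(max=1.0 - 1e-6)).mean()

    # Standard cross-entropy on retain data preserves general utility.
    retain_loss = F.cross_entropy(retain_logits, retain_targets)

    return forget_loss + lam * retain_loss
```

In a full fine-tuning loop, this loss would be averaged over mini-batches drawn from the forget and retain sets and backpropagated through the LLM; the assumed `lam` knob trades off the forget quality and model utility that the evaluation measures.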