Machine Unlearning for Masked Diffusion Language Models

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of effective machine unlearning methods for masked diffusion language models (MDLMs) by proposing the Masked Diffusion Unlearning (MDU) framework. MDU introduces the first unlearning mechanism specifically designed for MDLMs, formulating knowledge removal as a distributional rollback within the diffusion process. At each masked position, it erases target knowledge by minimizing the forward KL divergence between prompt-conditioned predictions and an unconditional anchor distribution. A temperature scaling parameter is incorporated to balance privacy preservation against model utility. Experimental results demonstrate that MDU consistently outperforms existing unlearning approaches for large language models across multiple standard benchmarks and MDLM architectures, achieving a superior trade-off between forgetting efficacy and model functionality.

📝 Abstract

Recent masked diffusion language models (MDLMs), such as LLaDA and Dream, have achieved performance comparable to autoregressive large language models. Unlike autoregressive models, which generate text sequentially, MDLMs generate text by iteratively denoising masked positions in parallel. During fine-tuning, MDLMs learn to recover responses from masked response states conditioned on a prompt, thereby shifting their predictions from a prompt-masked unconditional distribution toward a prompt-conditional distribution. Despite this distinct generative and fine-tuning mechanism, machine unlearning for MDLMs remains largely unexplored. In this paper, we propose Masked Diffusion Unlearning (MDU), the first unlearning framework for MDLMs, by revisiting the process of learning specific knowledge in terms of diffusion. Specifically, MDU minimizes a forward KL divergence from the prompt-conditional prediction to a prompt-masked unconditional anchor at every masked response position, with a temperature scaling parameter to control the privacy-utility trade-off. Our empirical results on standard benchmarks and MDLM backbones show that MDU achieves high unlearning performance compared to existing LLM unlearning methods. Code is available at https://github.com/leegeoru/MDU.
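The objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes that the per-position loss is KL(anchor || conditional), with the prompt-masked anchor held fixed as the target, and that the temperature is applied to the anchor logits. The abstract fixes neither choice precisely, and the function names here are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one position's vocabulary logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def mdu_style_loss(cond_logits, anchor_logits, masked_positions, temperature=1.0):
    """Average KL(anchor || conditional) over masked response positions.

    cond_logits[pos]   : logits predicted with the prompt visible
    anchor_logits[pos] : logits predicted with the prompt masked (held fixed)
    Assumption: the temperature rescales the anchor only.
    """
    if not masked_positions:
        return 0.0
    total = 0.0
    for pos in masked_positions:
        anchor = softmax(anchor_logits[pos], temperature)
        cond = softmax(cond_logits[pos])
        total += kl_divergence(anchor, cond)
    return total / len(masked_positions)
```

Under these assumptions, the loss is zero when the prompt-conditional prediction already matches the anchor at every masked position and grows as the prompt pulls the prediction away from it. In this sketch a higher temperature flattens the anchor, which is one plausible way a single scalar could trade forgetting strength against utility.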

Problem

Research questions and friction points this paper is trying to address.

Machine Unlearning

Masked Diffusion Language Models

Knowledge Removal

Privacy

LLM Unlearning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Masked Diffusion Language Models

Machine Unlearning

Forward KL Divergence