Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models

📅 2024-05-21

🏛️ Neural Information Processing Systems

📈 Citations: 11

✨ Influential: 1

🤖 AI Summary

This work addresses the privacy risks posed by leaked visual data of concepts in multimodal large language models (MLLMs), and the authors describe it as the first study of machine unlearning for such models. It proposes Single Image Unlearning (SIU), a lightweight fine-tuning method that unlearns the visual recognition of a target concept using a single associated image and only a few training steps. SIU constructs multifaceted fine-tuning data around four targets and jointly optimizes a Dual Masked KL-divergence loss and a cross-entropy loss, so that the concept is forgotten while the model's utility is preserved. The work also establishes MMUBench, a new benchmark for machine unlearning in MLLMs, together with a collection of evaluation metrics. Experiments on MMUBench show SIU outperforming existing methods, and the authors report that it also resists membership inference and jailbreak attacks.

📝 Abstract

Machine unlearning (MU) empowers individuals with the "right to be forgotten" by removing their private or sensitive information encoded in machine learning models. However, it remains uncertain whether MU can be effectively applied to Multimodal Large Language Models (MLLMs), particularly in scenarios that require forgetting the leaked visual data of concepts. To address this challenge, we propose an efficient method, Single Image Unlearning (SIU), which unlearns the visual recognition of a concept by fine-tuning on a single associated image for a few steps. SIU consists of two key aspects: (i) Constructing multifaceted fine-tuning data. We introduce four targets, based on which we construct fine-tuning data for the concepts to be forgotten; (ii) Joint training loss. To simultaneously forget the visual recognition of concepts and preserve the utility of MLLMs, we fine-tune MLLMs with a novel Dual Masked KL-divergence Loss combined with a Cross Entropy loss. Alongside our method, we establish MMUBench, a new benchmark for MU in MLLMs, and introduce a collection of metrics for its evaluation. Experimental results on MMUBench show that SIU surpasses the performance of existing methods. Furthermore, we surprisingly find that SIU can resist invasive membership inference attacks and jailbreak attacks. To the best of our knowledge, we are the first to explore MU in MLLMs. We will release the code and benchmark in the near future.
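
A minimal sketch of how a joint objective of this kind could be assembled is given below. The abstract does not define the two masks or how the two loss terms are weighted, so everything here is an assumption made for illustration: one mask selects which token positions enter the KL term, the other selects which vocabulary entries the two distributions are compared over, and `masked_kl`, `joint_loss`, and `kl_weight` are hypothetical names. It should not be read as the authors' implementation.

```python
import math


def softmax(logits, keep=None):
    """Softmax restricted to vocabulary entries where keep[v] is True."""
    if keep is None:
        keep = [True] * len(logits)
    peak = max(x for x, k in zip(logits, keep) if k)
    exps = [math.exp(x - peak) if k else 0.0 for x, k in zip(logits, keep)]
    total = sum(exps)
    return [e / total for e in exps]


def masked_kl(ref_logits, new_logits, token_keep, vocab_keep):
    """Mean KL(ref || new) over kept token positions and kept vocabulary entries.

    ASSUMPTION: token_keep and vocab_keep stand in for the two masks of the
    "dual masked" loss; the source does not specify how they are built.
    """
    total, count = 0.0, 0
    for ref, new, keep_position in zip(ref_logits, new_logits, token_keep):
        if not keep_position:
            continue  # masked positions do not constrain the fine-tuned model
        p = softmax(ref, vocab_keep)
        q = softmax(new, vocab_keep)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
        count += 1
    return total / count if count else 0.0


def cross_entropy(new_logits, targets):
    """Mean negative log-likelihood of the fine-tuning targets."""
    losses = [-math.log(softmax(logits)[t]) for logits, t in zip(new_logits, targets)]
    return sum(losses) / len(losses)


def joint_loss(ref_logits, new_logits, targets, token_keep, vocab_keep, kl_weight=1.0):
    """Cross entropy pushes toward the new targets; masked KL keeps the rest
    of the distribution close to the reference (pre-unlearning) model.

    ASSUMPTION: a simple weighted sum with a hypothetical kl_weight.
    """
    ce = cross_entropy(new_logits, targets)
    kl = masked_kl(ref_logits, new_logits, token_keep, vocab_keep)
    return ce + kl_weight * kl
```

Here `ref_logits` and `new_logits` are lists of per-position logit vectors from the original and the fine-tuned model. Under these assumptions, a vocabulary entry excluded by `vocab_keep` can change freely without adding to the KL term, which is one way the forgetting and utility-preserving goals could be kept from working against each other.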

Problem

Research questions and friction points this paper is trying to address.

Efficiently removing visual concept recognition in MLLMs

Addressing privacy via single-image fine-tuning for unlearning

Preventing attacks while preserving model utility post-unlearning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Single Image Unlearning for MLLMs

Dual Masked KL-divergence Loss

Multifaceted fine-tuning data construction

🔎 Similar Papers

Cross-Modal Safety Alignment: Is textual unlearning all you need?