🤖 AI Summary
Existing benchmarks for unlearning in vision-language models struggle to simulate the fine-grained, personalized partial knowledge removal required in real-world scenarios. To address this gap, this work proposes PPU-Bench, the first real-world-oriented benchmark for personalized partial unlearning, comprising 24K multimodal and unimodal samples drawn from the pre-existing knowledge of 500 public figures across three progressively harder unlearning settings: Complete, Selective, and Personalized. The study further introduces Boundary-Aware Optimization (BAO), a method that explicitly models the boundary between knowledge to be forgotten and knowledge to be retained within a single subject. Experiments reveal that Complete Unlearning often suppresses visual identity rather than factual knowledge, while Selective and Personalized Unlearning expose significant forget-retain trade-offs. Applied to two representative unlearning methods, BAO effectively enforces intra-subject factual boundaries, removing the targeted knowledge while keeping the subject's non-target facts.
📝 Abstract
Multimodal Large Language Models (MLLMs) may memorize sensitive cross-modal information during pretraining. However, existing MLLM unlearning benchmarks rely on synthetic knowledge injection or complete subject-level deletion, which fail to capture realistic, personalized deletion requests that require fine-grained factual control. In this paper, we introduce PPU-Bench, a real-world and fine-tuning-free benchmark for personalized partial unlearning in MLLMs. PPU-Bench contains 24K multimodal and unimodal samples derived from pre-existing knowledge of 500 public figures under three progressively challenging settings: Complete, Selective, and Personalized unlearning. The benchmark evaluates whether methods can remove target knowledge while preserving non-target facts, model utility, and cross-modal consistency. Extensive experiments show that Complete Unlearning often suppresses visual identity rather than factual knowledge, while Selective and Personalized Unlearning expose significant forget-retain trade-offs and challenges in intra-subject factual boundaries. Robustness analysis under cross-image and prompt-based attacks reveals distinct vulnerabilities across different unlearning settings. Motivated by these findings, we propose Boundary-Aware Optimization (BAO), which explicitly models intra-subject forget-retain boundaries. Experimental results on two representative methods demonstrate that BAO can effectively enforce intra-subject factual boundaries.