OFFSIDE: Benchmarking Unlearning Misinformation in Multimodal Large Language Models

📅 2025-10-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing multimodal large language model (MLLM) machine unlearning (MU) benchmarks suffer from low image diversity, inaccurate annotations, and narrow evaluation scenarios, failing to reflect the complexity of multimodal false information unlearning in real-world applications. To address this, we propose OFFSIDE—the first MLLM unlearning benchmark specifically designed for football transfer rumors—comprising 15.68K human-annotated samples and four distinct test sets that support selective unlearning, text-only unlearning, and corrective relearning. Through systematic evaluation of unlearning efficacy, generalization, utility preservation, and robustness, we identify five critical challenges, including the persistence of visual misinformation and susceptibility to prompt-based recovery attacks. Empirical results reveal that current methods heavily rely on catastrophic forgetting for multimodal unlearning and exhibit insufficient robustness. OFFSIDE establishes a new benchmark and delivers key insights for developing trustworthy MLLMs.

📝 Abstract
Advances in Multimodal Large Language Models (MLLMs) intensify concerns about data privacy, making Machine Unlearning (MU), the selective removal of learned information, a critical necessity. However, existing MU benchmarks for MLLMs are limited by a lack of image diversity, potential annotation inaccuracies, and insufficient evaluation scenarios, and thus fail to capture the complexity of real-world applications. To facilitate the development of MLLM unlearning and alleviate these limitations, we introduce OFFSIDE, a novel benchmark for evaluating misinformation unlearning in MLLMs based on football transfer rumors. This manually curated dataset contains 15.68K records for 80 players, providing a comprehensive framework with four test sets to assess forgetting efficacy, generalization, utility, and robustness. OFFSIDE supports advanced settings such as selective unlearning and corrective relearning, and, crucially, unimodal unlearning (forgetting only text data). Our extensive evaluation of multiple baselines reveals key findings: (1) unimodal methods (erasing text-based knowledge) fail on multimodal rumors; (2) unlearning efficacy is largely driven by catastrophic forgetting; (3) all methods struggle with "visual rumors" (rumors that appear in the image); (4) unlearned rumors can be easily recovered; and (5) all methods are vulnerable to prompt attacks. These results expose significant vulnerabilities in current approaches, highlighting the need for more robust multimodal unlearning solutions. The code is available at https://github.com/zh121800/OFFSIDE.
Problem

Research questions and friction points this paper is trying to address.

Benchmarking misinformation unlearning in multimodal language models
Addressing limitations in image diversity and evaluation scenarios
Evaluating forgetting efficacy and robustness against prompt attacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces OFFSIDE benchmark for misinformation unlearning
Manually curated dataset with 15.68K football transfer records
Supports selective unlearning and corrective relearning settings
Authors

Hao Zheng — Harbin Institute of Technology
Zirui Pang — University of Illinois Urbana-Champaign
Ling Li — The Hong Kong University of Science and Technology (Guangzhou)
Zhijie Deng — The Hong Kong University of Science and Technology (Guangzhou)
Yuhan Pu — The Hong Kong University of Science and Technology (Guangzhou)
Zhaowei Zhu — Docta.ai; University of California, Santa Cruz
Xiaobo Xia — Postdoc, National University of Singapore
Jiaheng Wei — The Hong Kong University of Science and Technology (Guangzhou)