Object Hallucination-Free Reinforcement Unlearning for Vision-Language Models

Date: 2026-05-08
Citations: 0
Influential: 0

AI Summary
Existing unlearning methods for vision-language models typically fine-tune only the language decoder, which fails to fully erase the underlying visual representations and often induces object hallucination. To address this limitation, this work proposes HFRU, a framework that introduces, for the first time, a reinforcement learning-driven deep semantic unlearning mechanism operating directly on the visual encoder. HFRU uses a two-stage strategy that combines alignment disruption with GRPO-based optimization, guided by a composite reward whose abstraction reward encourages semantically coherent knowledge replacement. Experiments on object recognition and facial identity tasks show that HFRU achieves over 98% forgetting and retention performance, nearly eliminates object hallucination, and significantly outperforms current state-of-the-art methods.
Abstract
Vision-language models (VLMs) raise growing concerns about privacy, copyright, and bias, motivating machine unlearning to remove sensitive knowledge. However, existing methods primarily fine-tune the language decoder, leading to superficial forgetting that fails to erase underlying visual representations and often introduces object hallucination. We propose HFRU, a reinforcement unlearning framework that operates on the vision encoder for deep semantic removal. Our two-stage approach combines alignment disruption with GRPO-based optimization using a composite reward, including an abstraction reward that encourages semantically valid substitutions and mitigates hallucinations. Experiments on object recognition and face identity tasks show that HFRU achieves over 98% forgetting and retention performance while introducing negligible object hallucination, significantly outperforming prior methods. Our code and implementation details are available at https://github.com/XMUDeepLIT/HFRU.
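To make the GRPO step concrete, the following is a minimal sketch of how a composite reward could be turned into group-relative advantages, the quantity GRPO uses in place of a learned value baseline. It is not the authors' implementation. The abstract names only the abstraction reward, so the forget and retain terms, the equal weights, and the function names composite_reward and group_relative_advantages are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def composite_reward(forget, retain, abstraction, weights=(1.0, 1.0, 1.0)):
    """Hypothetical composite reward: a weighted sum of three terms.

    Only the abstraction term is named in the abstract; the forget and
    retain terms and the weights are assumptions for this sketch.
    """
    w_forget, w_retain, w_abstraction = weights
    return w_forget * forget + w_retain * retain + w_abstraction * abstraction


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of sampled responses.

    GRPO scores each response relative to the other responses sampled
    for the same prompt: (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical responses to one forget-set image, scored per term.
group_rewards = [
    composite_reward(forget=1.0, retain=1.0, abstraction=1.0),
    composite_reward(forget=1.0, retain=1.0, abstraction=0.0),
    composite_reward(forget=0.0, retain=1.0, abstraction=0.0),
    composite_reward(forget=0.0, retain=0.0, abstraction=0.0),
]
advantages = group_relative_advantages(group_rewards)
```

Responses that score above the group mean receive positive advantages and are reinforced, while those below the mean are discouraged. If every response in a group earns the same reward, all advantages are zero and that group contributes no gradient signal.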
Problem

Research questions and friction points this paper is trying to address.

object hallucination
machine unlearning
vision-language models
visual representation forgetting
Innovation

Methods, ideas, or system contributions that make the work stand out.

reinforcement unlearning
vision-language models
object hallucination
machine unlearning
GRPO