AI Summary
This work addresses the challenge that large vision-language models struggle to precisely reject specific image-instruction pairs during continual unlearning, often misrejecting due to representation entanglement across shared features. To mitigate this, the authors propose a concept-decomposition-based continual unlearning framework that decouples deletion targets into fine-grained visual and textual concepts. A concept modulator identifies the combinations of concepts to be unlearned, while a dedicated rejection expert generates semantically aligned refusal responses. Integrated with a multimodal concept routing mechanism, the approach enables concept-level unlearning across tasks. Experiments on vision-language benchmarks demonstrate that the method significantly outperforms existing techniques, achieving precise, reusable, and concept-aligned refusals while effectively preserving the model's general capabilities throughout the continual unlearning process.
Abstract
Continual unlearning poses the challenge of enabling large vision-language models to selectively refuse specific image-instruction pairs in response to sequential deletion requests, while preserving general utility. However, sequential unlearning updates distort shared representations, creating spurious associations between vision-language pairs and refusal behaviors that hinder precise identification of refusal targets, resulting in inappropriate refusals. To address this challenge, we propose a novel continual unlearning framework that grounds refusal behavior in fine-grained descriptions of the visual and textual concepts decomposed from deletion targets. We first identify which visual-linguistic concept combinations characterize each forget category through a concept modulator, then determine how to generate appropriate refusal responses via a mixture of refusal experts, termed refusers, each specialized for concept-aligned refusal generation. To generate concept-specific refusal responses across sequential tasks, we introduce a multimodal, concept-driven routing scheme that reuses refusers for tasks sharing similar concepts and adapts underutilized ones for novel concepts. Extensive experiments on vision-language benchmarks demonstrate that the proposed framework outperforms existing methods by generating concept-grounded refusal responses and preserving general utility across unlearning sequences.
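The routing scheme described above can be sketched at a high level: for each new unlearning task, compare its concept embedding against the concept signatures of existing refusers, reuse a refuser whose signature is sufficiently similar, and otherwise adapt the least-utilized refuser to the novel concepts. The class, function names, similarity threshold, and toy embeddings below are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of multimodal concept-driven routing: reuse a refusal
# expert ("refuser") for tasks with similar concept signatures, otherwise
# adapt an underutilized one. Names and the threshold are assumptions.
import math

def cosine(a, b):
    # Cosine similarity between two concept embeddings.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class ConceptRouter:
    def __init__(self, num_refusers, threshold=0.8):
        self.threshold = threshold
        self.signatures = {}  # refuser id -> concept embedding of its tasks
        self.usage = {i: 0 for i in range(num_refusers)}

    def route(self, concept_embedding):
        # Reuse the most similar existing refuser if it clears the threshold.
        best_id, best_sim = None, -1.0
        for rid, sig in self.signatures.items():
            sim = cosine(concept_embedding, sig)
            if sim > best_sim:
                best_id, best_sim = rid, sim
        if best_id is not None and best_sim >= self.threshold:
            self.usage[best_id] += 1
            return best_id, "reused"
        # Otherwise adapt the least-utilized refuser to the novel concepts.
        rid = min(self.usage, key=self.usage.get)
        self.signatures[rid] = concept_embedding
        self.usage[rid] += 1
        return rid, "adapted"

router = ConceptRouter(num_refusers=4)
rid1, how1 = router.route([1.0, 0.0, 0.0])    # novel concept: adapt a refuser
rid2, how2 = router.route([0.95, 0.05, 0.0])  # similar concept: reuse it
```

In this toy run the second task's embedding is nearly parallel to the first, so the router returns the same refuser id with the "reused" label; an orthogonal embedding would instead trigger adaptation of a fresh refuser.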