🤖 AI Summary
This work addresses machine unlearning in continually deployed vision-language models (VLMs), where removal requests arrive sequentially rather than all at once. The setting imposes three requirements: removing the targeted knowledge, preserving the model's remaining utility, and preventing forgotten knowledge from re-emerging after later updates. The paper presents what the authors describe as the first study of continual unlearning for VLMs and proposes CATA, a conflict-averse task arithmetic method. Each unlearning request is represented as a task vector, which is combined with stored historical task vectors through sign-aware, conflict-averse aggregation that suppresses update components likely to weaken earlier forgetting. Experiments in both single-shot and continual settings show that CATA outperforms the baselines on three metrics: forgetting effectiveness, model fidelity, and forgetting persistence.
📝 Abstract
Vision-language models (VLMs) have shown remarkable ability in aligning visual and textual representations, enabling a wide range of multimodal applications. However, their large-scale training data inevitably raises concerns about privacy, copyright, and undesirable content, creating a strong need for machine unlearning. While existing studies mainly focus on single-shot unlearning, practical VLM deployment often involves sequential removal requests over time, giving rise to continual machine unlearning. In this work, we make the first attempt to study continual unlearning for VLMs and identify three key challenges in this setting: effectiveness in removing target knowledge, fidelity in preserving retained model utility, and persistence in preventing knowledge re-emergence under sequential updates. To address these challenges, we propose CATA, a conflict-averse task arithmetic method that represents each forget request as an unlearning task vector. By maintaining historical task vectors and performing sign-aware conflict-averse aggregation, CATA suppresses conflicting update components that may weaken previous forgetting effects. Extensive experiments under both single-shot and continual settings show that CATA outperforms baselines in terms of forgetting effectiveness, model fidelity, and forgetting persistence.
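The abstract does not spell out CATA's aggregation rule, so the following is only a minimal sketch of one way sign-aware, conflict-averse aggregation over task vectors could look. It assumes each unlearning task vector is a parameter delta that is added to the model weights, and that a component of a new vector "conflicts" when its sign opposes the sign of the summed historical vectors. The function names (`conflict_averse_merge`, `apply_vectors`), the zeroing rule, and the toy values are all hypothetical and are not taken from the paper.

```python
import numpy as np

def conflict_averse_merge(history, new_vec):
    """Zero the entries of new_vec whose sign opposes the summed history.

    Assumed rule for illustration only; the paper's exact mechanism may differ.
    """
    merged = {}
    for name, new in new_vec.items():
        # Accumulated direction of all earlier unlearning updates for this tensor.
        past = sum((h[name] for h in history), np.zeros_like(new))
        # A component conflicts when it points against the accumulated direction.
        conflict = np.sign(new) * np.sign(past) < 0
        merged[name] = np.where(conflict, 0.0, new)
    return merged

def apply_vectors(theta, vectors, scale=1.0):
    """Add each (scaled) task vector to a copy of the parameters."""
    out = {k: v.copy() for k, v in theta.items()}
    for vec in vectors:
        for k in out:
            out[k] = out[k] + scale * vec[k]
    return out

# Toy example: one parameter tensor and two sequential forget requests.
theta = {"w": np.array([1.0, 1.0, 1.0])}
v1 = {"w": np.array([-0.5, 0.2, 0.0])}   # first unlearning task vector
v2 = {"w": np.array([0.3, 0.1, -0.4])}   # second request; entry 0 opposes v1

history = [v1]
merged = conflict_averse_merge(history, v2)   # entry 0 of v2 is suppressed
theta_new = apply_vectors(theta, [v1, merged])
```

In this toy case, only the first component of the second vector is dropped, because it would push the weight back toward its value before the first forget request. The remaining components pass through unchanged, which reflects the intended trade-off: protect earlier forgetting without discarding the non-conflicting part of the new update.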