SineProject: Machine Unlearning for Stable Vision Language Alignment

📅 2025-11-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the problem that multimodal large language models (MLLMs) erroneously reject benign queries during machine unlearning due to degradation of vision–language alignment, this paper proposes SineProject: a method that enhances the frozen visual projector via sinusoidal-modulated trainable parameters. Without updating the backbone model, SineProject explicitly optimizes the spectral condition number of the projector’s Jacobian matrix—marking the first approach to directly control projector ill-conditioning. The method jointly preserves unlearning accuracy and cross-modal alignment stability. On LLaVA-7B and LLaVA-13B, it achieves complete erasure of target knowledge while substantially reducing benign query rejection rates. It establishes the state-of-the-art trade-off between forgetting efficacy and retention fidelity, with negligible computational overhead.
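The quantity the summary says SineProject optimizes, the spectral condition number of the projector's Jacobian, is just the ratio of the largest to the smallest singular value. A minimal numpy sketch (for a purely linear projector `f(x) = W x`, whose Jacobian is `W` itself; the weight here is random illustration data, not the paper's model):

```python
import numpy as np

# For a linear projector f(x) = W x, the Jacobian is W itself.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32))  # e.g. vision features -> LLM embedding dim

# Spectral condition number: largest / smallest singular value.
s = np.linalg.svd(W, compute_uv=False)
kappa = s.max() / s.min()
```

A large `kappa` means some input directions are amplified far more than others, which is the ill-conditioning the paper associates with unstable unlearning updates and cross-modal drift.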

📝 Abstract
Multimodal Large Language Models (MLLMs) increasingly need to forget specific knowledge, such as unsafe or private information, without full retraining. However, existing unlearning methods often disrupt vision-language alignment, causing models to reject both harmful and benign queries. We trace this failure to the projector network: during unlearning, its Jacobian becomes severely ill-conditioned, leading to unstable optimization and drift in cross-modal embeddings. We introduce SineProject, a simple method that augments the frozen projector with sinusoidally modulated trainable parameters, improving the Jacobian's spectral conditioning and stabilizing alignment throughout unlearning. Across standard safety and privacy unlearning benchmarks with LLaVA v1.5 7B and 13B, SineProject reduces benign-query refusals while achieving complete forgetting of targeted information, yielding state-of-the-art forget-retain trade-offs with negligible computational overhead.
Problem

Research questions and friction points this paper is trying to address.

Multimodal models need selective forgetting without full retraining
Existing unlearning disrupts vision-language alignment in MLLMs
Projector network instability causes harmful and benign query rejection
Innovation

Methods, ideas, or system contributions that make the work stand out.

SineProject augments frozen projector with sinusoids
Modulated parameters improve Jacobian spectral conditioning
Stabilizes vision-language alignment during unlearning process
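One way to read "augments the frozen projector with sinusoidally modulated trainable parameters" is as a bounded additive perturbation of the frozen weight. The sketch below is hypothetical (the paper's exact parameterization, and the names `alpha`, `omega`, `Theta`, are assumptions for illustration): the pretrained weight stays frozen, and only `Theta` would be trained during unlearning.

```python
import numpy as np

rng = np.random.default_rng(1)
d_out, d_in = 16, 8
W_frozen = rng.normal(size=(d_out, d_in))  # pretrained projector weight, never updated
Theta = np.zeros((d_out, d_in))            # trainable parameters, zero-initialized
alpha, omega = 0.1, 1.0                    # hypothetical amplitude / frequency

def effective_weight(Theta):
    # W_eff = W + alpha * sin(omega * Theta): the sinusoid keeps the
    # perturbation bounded by alpha, and sin(0) = 0 means the projector
    # starts out exactly equal to the frozen pretrained one.
    return W_frozen + alpha * np.sin(omega * Theta)
```

Because the added term is bounded by `alpha` elementwise, the effective weight can never drift arbitrarily far from the pretrained projector, which is one plausible mechanism for keeping its spectral conditioning, and hence the alignment, stable.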