Erasing CLIP Memories: Non-Destructive, Data-Free Zero-Shot class Unlearning in CLIP Models

📅 2025-12-16
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses selective knowledge erasure in multimodal models such as CLIP, without retraining and without access to images of the classes to be forgotten. We propose a zero-shot, data-free, non-destructive class forgetting method. The approach builds an orthonormal basis for the subspace spanned by the target-class text embeddings, then applies a closed-form nullspace projection at the final projection layer to attenuate cross-modal alignment for those classes. To our knowledge, this is the first method to enable controllable and interpretable class-level forgetting in large multimodal models: zero-shot accuracy on target classes drops significantly, while performance on non-target classes remains nearly intact. A partial projection is enough to trade forgetting strength against knowledge retention. The result is an efficient, lightweight, training-free route to model sanitization that supports safer governance of foundation models.

📝 Abstract
We introduce a novel, closed-form approach for selective unlearning in multimodal models, specifically targeting pretrained models such as CLIP. Our method leverages nullspace projection to erase the target class information embedded in the final projection layer, without requiring any retraining or the use of images from the forget set. By computing an orthonormal basis for the subspace spanned by the target text embeddings and projecting these directions out, we dramatically reduce the alignment between image features and undesired classes. Unlike traditional unlearning techniques that rely on iterative fine-tuning and extensive data curation, our approach is both computationally efficient and surgically precise. This leads to a pronounced drop in zero-shot performance for the target classes while preserving the overall multimodal knowledge of the model. Our experiments demonstrate that even a partial projection can strike a balance between complete unlearning and retaining useful information, addressing key challenges in model decontamination and privacy preservation.
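The projection described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `alpha` strength parameter, the row-vector convention `x @ w_proj`, and the random stand-in data are all assumptions made for illustration, and no real CLIP weights are involved.

```python
import numpy as np


def forget_projector(text_embs, alpha=1.0, tol=1e-10):
    """Build P = I - alpha * Q @ Q.T for the span of the target text embeddings.

    text_embs: (k, d) array, one row per class to forget.
    alpha: assumed strength knob; 1.0 removes the subspace,
           0 < alpha < 1 only attenuates it.
    """
    # Left singular vectors with non-negligible singular values form an
    # orthonormal basis Q of shape (d, r) for the span of the embeddings.
    u, s, _ = np.linalg.svd(text_embs.T, full_matrices=False)
    q = u[:, s > tol * s[0]]
    return np.eye(text_embs.shape[1]) - alpha * (q @ q.T)


def edit_projection(w_proj, text_embs, alpha=1.0):
    """Fold the projector into a final projection matrix of shape (d_in, d).

    Assumes image features are row vectors mapped by `x @ w_proj`.
    """
    return w_proj @ forget_projector(text_embs, alpha)


# Synthetic stand-ins; these are NOT real CLIP weights or embeddings.
rng = np.random.default_rng(0)
w = rng.normal(size=(32, 16))         # pretend final projection layer
t_forget = rng.normal(size=(2, 16))   # pretend text embeddings to forget
x = rng.normal(size=(5, 32))          # pretend pre-projection image features

scores_before = (x @ w) @ t_forget.T
scores_full = (x @ edit_projection(w, t_forget, alpha=1.0)) @ t_forget.T
scores_half = (x @ edit_projection(w, t_forget, alpha=0.5)) @ t_forget.T

print(np.abs(scores_before).max())  # clearly nonzero
print(np.abs(scores_full).max())    # numerically zero
```

With `alpha=1.0` every edited image embedding is orthogonal to the forgotten text embeddings, so their dot products vanish, and they stay at zero if the embeddings are L2-normalized afterwards. With `alpha=0.5` the raw dot products are halved. The edit is a single matrix product on the stored weights, which is why no training loop or forget-set images are needed in this sketch.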

Problem

Research questions and friction points this paper is trying to address.

Erases target class information in CLIP models
Uses nullspace projection without retraining or forget-set images
Balances unlearning with preserving overall model knowledge
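One way to look at the balance named in the last point is to sweep a projection strength and track two quantities: how much alignment with the forgotten classes remains, and how far the scores of retained classes drift. The sketch below does this on random vectors. The `alpha` knob, the `drift` metric, and all data are illustrative assumptions rather than the paper's evaluation protocol, and the forget embeddings are assumed to be linearly independent.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
t_forget = rng.normal(size=(2, d))   # stand-ins for target-class text embeddings
t_keep = rng.normal(size=(10, d))    # stand-ins for retained-class text embeddings
z = rng.normal(size=(100, d))        # stand-ins for image embeddings

# Orthonormal basis (columns of q) for the forget subspace via reduced QR.
q, _ = np.linalg.qr(t_forget.T)


def mean_abs_score(emb, text):
    """Average absolute image-text dot product."""
    return float(np.abs(emb @ text.T).mean())


baseline_keep = mean_abs_score(z, t_keep)
results = []
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    p = np.eye(d) - alpha * (q @ q.T)   # partial nullspace projector
    z_edit = z @ p
    forget = mean_abs_score(z_edit, t_forget)
    # Change in retained-class scores, relative to their original size.
    drift = float(np.abs((z_edit - z) @ t_keep.T).mean()) / baseline_keep
    results.append((alpha, forget, drift))
    print(f"alpha={alpha:.2f}  forget_score={forget:.4f}  retain_drift={drift:.4f}")
```

In this toy setting the forget score falls linearly to zero as `alpha` approaches 1, while the retained-class drift grows linearly but stays small, because random retained embeddings overlap only slightly with a two-dimensional forget subspace. Real class embeddings can be far more correlated, so the drift on an actual model may be larger.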

Innovation

Methods, ideas, or system contributions that make the work stand out.

Nullspace projection erases target class information
Projecting out an orthonormal basis of the target text embeddings reduces image-text alignment for those classes
Partial projection balances unlearning and knowledge retention
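
The three points above can be written as a short worked derivation. The notation is assumed for illustration and may differ from the paper's: Q holds an orthonormal basis of the target text embeddings as columns, α is a hypothetical projection strength, z is an image embedding, t is a target-class text embedding, and r is a retained-class text embedding.

```latex
% Partial nullspace projector (assumed form)
P_\alpha = I - \alpha\, Q Q^\top, \qquad Q^\top Q = I_k, \qquad 0 \le \alpha \le 1

% Target class: t lies in span(Q), so Q Q^\top t = t and the score shrinks by (1 - \alpha)
\langle P_\alpha z,\, t \rangle = \langle z,\, t \rangle - \alpha\, \langle z,\, Q Q^\top t \rangle = (1 - \alpha)\, \langle z,\, t \rangle

% Retained class: the score changes only through the overlap of r with span(Q)
\langle P_\alpha z,\, r \rangle = \langle z,\, r \rangle - \alpha\, \langle Q^\top z,\, Q^\top r \rangle
```

Setting α = 1 zeroes the target-class score exactly. A retained class is unaffected when its text embedding is orthogonal to the forget subspace, and is perturbed in proportion to α otherwise, which is one way to read the trade-off between forgetting and retention.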