Suppressing Non-Semantic Noise in Masked Image Modeling Representations

📅 2026-03-31
🤖 AI Summary
This work addresses the problem that representations learned through Masked Image Modeling (MIM) retain non-semantic noise, which degrades performance on downstream tasks. To mitigate this without retraining the model, the authors propose SOAP (Semantically Orthogonal Artifact Projection), a model-agnostic, plug-and-play post-processing method. A semantic invariance score is constructed with Principal Component Analysis (PCA) on real and synthetic non-semantic images; guided by this score, SOAP applies an orthogonal projection, implemented as a single linear head, to patch-level representations in order to suppress their non-semantic components. Evaluated across a range of MIM-based models, SOAP consistently improves zero-shot performance.
📝 Abstract
Masked Image Modeling (MIM) has become a ubiquitous self-supervised vision paradigm. In this work, we show that MIM objectives cause the learned representations to retain non-semantic information, which ultimately hurts performance during inference. We introduce a model-agnostic score for semantic invariance using Principal Component Analysis (PCA) on real and synthetic non-semantic images. Based on this score, we propose a simple method, Semantically Orthogonal Artifact Projection (SOAP), to directly suppress non-semantic information in patch representations, leading to consistent improvements in zero-shot performance across various MIM-based models. SOAP is a post-hoc suppression method, requires zero training, and can be attached to any model as a single linear head.
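The abstract does not spell out how SOAP is constructed, so the following is only a minimal sketch of the general idea it describes: estimate dominant directions with PCA on features of non-semantic inputs, then remove those directions from patch representations with a fixed linear map. The function names, the number of removed components `k`, and the synthetic features are all assumptions made for illustration; this is not the authors' implementation and it does not reproduce their semantic invariance score.

```python
import numpy as np


def fit_artifact_projection(noise_feats: np.ndarray, k: int) -> np.ndarray:
    """Return a (d, d) matrix that removes the top-k PCA directions of noise_feats.

    noise_feats: (n, d) features assumed to come from non-semantic inputs.
    k: number of principal directions to remove (a hypothetical hyperparameter).
    """
    centered = noise_feats - noise_feats.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    u = vt[:k].T  # (d, k) with orthonormal columns
    # Orthogonal projector onto the complement of span(u).
    return np.eye(noise_feats.shape[1]) - u @ u.T


def apply_projection(patch_feats: np.ndarray, proj: np.ndarray) -> np.ndarray:
    """Apply the projection as a single linear map over the feature dimension."""
    return patch_feats @ proj  # proj is symmetric, so no transpose is needed


# Synthetic stand-in for encoder features: variance concentrated in 2 of 8 dims.
rng = np.random.default_rng(0)
d = 8
basis = np.eye(d)[:2]  # the two "artifact" directions in this toy setup
noise_feats = 5.0 * rng.normal(size=(200, 2)) @ basis
noise_feats += 0.01 * rng.normal(size=(200, d))

proj = fit_artifact_projection(noise_feats, k=2)
patches = rng.normal(size=(16, d))  # stand-in for patch representations
cleaned = apply_projection(patches, proj)
```

Because the map is a fixed matrix, it needs no training and could in principle be appended to a frozen encoder as one linear layer, which is consistent with the "single linear head" description above. How many directions to remove, and which ones, is exactly what a score such as the paper's would have to decide; the sketch simply takes `k` as given.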
Problem

Research questions and friction points this paper is trying to address.

Masked Image Modeling
non-semantic noise
representation learning
semantic invariance
self-supervised learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Masked Image Modeling
Semantic Invariance
Non-Semantic Noise
PCA
SOAP