Masked Feature Modeling Enhances Adaptive Segmentation

📅 2025-09-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the underutilization of masked modeling in unsupervised domain adaptation (UDA) for semantic segmentation, this paper introduces Masked Feature Modeling (MFM), the first auxiliary task of this kind to operate directly in feature space. MFM incorporates a lightweight Rebuilder module that jointly optimizes the primary segmentation task and a feature reconstruction objective during training while preserving the original inference pipeline. The reconstruction target is explicitly aligned with the segmentation decoder to prevent task interference and inherently supports contrastive learning. As a plug-and-play strategy, MFM integrates seamlessly into mainstream frameworks, including DeepLab and DAFormer, without introducing any inference overhead. Extensive experiments on standard UDA benchmarks (e.g., GTA→Cityscapes) and diverse backbone architectures demonstrate consistent and significant performance gains, validating MFM’s generality, effectiveness, and computational efficiency.

📝 Abstract
Unsupervised domain adaptation (UDA) for semantic segmentation aims to transfer models from a labeled source domain to an unlabeled target domain. While auxiliary self-supervised tasks, particularly contrastive learning, have improved feature discriminability, masked modeling approaches remain underexplored in this setting, largely due to architectural incompatibility and misaligned optimization objectives. We propose Masked Feature Modeling (MFM), a novel auxiliary task that performs feature masking and reconstruction directly in the feature space. Unlike existing masked modeling methods that reconstruct low-level inputs or perceptual features (e.g., HOG or visual tokens), MFM aligns its learning target with the main segmentation task, ensuring compatibility with standard architectures like DeepLab and DAFormer without modifying the inference pipeline. To facilitate effective reconstruction, we introduce a lightweight auxiliary module, Rebuilder, which is trained jointly but discarded during inference, adding zero computational overhead at test time. Crucially, MFM leverages the segmentation decoder to classify the reconstructed features, tightly coupling the auxiliary objective with the pixel-wise prediction task to avoid interference with the primary task. Extensive experiments across various architectures and UDA benchmarks demonstrate that MFM consistently enhances segmentation performance, offering a simple, efficient, and generalizable strategy for unsupervised domain-adaptive semantic segmentation.
Problem

Research questions and friction points this paper is trying to address.

Enhancing unsupervised domain adaptation for semantic segmentation
Addressing architectural incompatibility in masked modeling approaches
Improving feature discriminability without inference overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

MFM performs masking and reconstruction directly in feature space, rather than on raw pixels or perceptual tokens
A lightweight Rebuilder module is trained jointly with the segmentation model and discarded at inference, adding zero test-time overhead
The segmentation decoder classifies the reconstructed features, aligning the auxiliary objective with the pixel-wise prediction task
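The mechanism the points above describe can be illustrated with a minimal NumPy sketch: mask a subset of spatial positions in a feature map, pass the masked map through a stand-in "Rebuilder", and score reconstruction only on masked positions. The shapes, mask ratio, zero mask token, and identity rebuild matrix are all placeholder assumptions for illustration, not the paper's implementation (which trains the Rebuilder jointly and classifies the reconstruction with the segmentation decoder).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical C-channel feature map on an H x W spatial grid.
C, H, W = 8, 4, 4
feats = rng.standard_normal((C, H, W))

# 1) Mask a random subset of spatial positions (mask ratio is a free choice).
mask_ratio = 0.5
mask = rng.random((H, W)) < mask_ratio           # True = masked position
mask_token = np.zeros((C, 1, 1))                 # learnable in practice; zeros here
masked_feats = np.where(mask[None, :, :], mask_token, feats)

# 2) "Rebuilder" stand-in: a linear map over channels that should restore feats.
#    In the paper this is a lightweight trained module, discarded at test time.
W_rebuild = np.eye(C)                            # identity as an untrained placeholder
rebuilt = np.einsum('dc,chw->dhw', W_rebuild, masked_feats)

# 3) Reconstruction objective, computed only on the masked positions.
recon_loss = ((rebuilt - feats) ** 2)[:, mask].mean()
```

Since the Rebuilder here is an untrained identity, `recon_loss` simply measures how much signal the masking destroyed; training would drive it down while the segmentation decoder supervises what the reconstructed features must encode.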