DUDA: Distilled Unsupervised Domain Adaptation for Lightweight Semantic Segmentation

📅 2025-04-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Lightweight models suffer from low pseudo-label quality and significant performance degradation in unsupervised domain adaptation (UDA) for semantic segmentation due to their architectural rigidity. To address this, the authors propose DUDA (Distilled Unsupervised Domain Adaptation), a framework integrating exponential moving average (EMA)-based self-training with knowledge distillation. Specifically, it introduces gradual large-to-small network distillation, an inconsistency-weighted loss that emphasizes hard-to-adapt classes, and learning from multiple teachers for robust pseudo-label generation. By jointly leveraging EMA self-training and distillation, the framework enhances the robustness and generalization of lightweight models under domain shift. Evaluated on four standard UDA benchmarks, the approach achieves state-of-the-art performance using only lightweight architectures, outperforming several mainstream heavyweight models across multiple metrics. This work is the first to empirically demonstrate the strong competitiveness of lightweight models in UDA semantic segmentation.

📝 Abstract
Unsupervised Domain Adaptation (UDA) is essential for enabling semantic segmentation in new domains without requiring costly pixel-wise annotations. State-of-the-art (SOTA) UDA methods primarily use self-training with architecturally identical teacher and student networks, relying on Exponential Moving Average (EMA) updates. However, these approaches face substantial performance degradation with lightweight models, whose inherent architectural inflexibility leads to low-quality pseudo-labels. To address this, we propose Distilled Unsupervised Domain Adaptation (DUDA), a novel framework that combines EMA-based self-training with knowledge distillation (KD). Our method employs an auxiliary student network to bridge the architectural gap between heavyweight and lightweight models for EMA-based updates, resulting in improved pseudo-label quality. DUDA strategically fuses UDA and KD, incorporating innovative elements such as gradual distillation from large to small networks, an inconsistency loss prioritizing poorly adapted classes, and learning with multiple teachers. Extensive experiments across four UDA benchmarks demonstrate DUDA's superiority in achieving SOTA performance with lightweight models, often surpassing the performance of heavyweight models from other approaches.
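The EMA-based teacher update that this family of self-training methods relies on can be sketched in a few lines. This is a minimal illustration using plain Python floats in place of real network weights; the function name `ema_update` and the momentum value are assumptions for illustration, not taken from the paper.

```python
# Minimal sketch of the EMA teacher update used in self-training UDA.
# Plain floats stand in for network parameters; names are illustrative.

ALPHA = 0.999  # EMA momentum; values close to 1 are typical (assumption)

def ema_update(teacher_params, student_params, alpha=ALPHA):
    """Blend student weights into the teacher: t <- alpha*t + (1-alpha)*s."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_params, student_params)]

teacher = [0.5, -1.0]
student = [1.0, 0.0]
teacher = ema_update(teacher, student, alpha=0.9)
print(teacher)  # teacher moves slightly toward the student
```

A momentum close to 1 makes the teacher a slowly moving average of student snapshots, which stabilizes the pseudo-labels it generates; DUDA's contribution is inserting an auxiliary student so this update still works when teacher and final student differ in architecture.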
Problem

Research questions and friction points this paper is trying to address.

Enables lightweight semantic segmentation without costly annotations
Improves pseudo-label quality for lightweight models via distillation
Bridges architectural gap between heavyweight and lightweight models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines EMA-based self-training with knowledge distillation
Uses auxiliary student network to bridge architectural gap
Incorporates gradual distillation and inconsistency loss
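The inconsistency-loss idea above can be illustrated with a toy weighting scheme: classes on which the student disagrees with the teacher's pseudo-labels receive a larger weight, so poorly adapted classes contribute more to the objective. The per-class `1 + disagreement rate` weight below is a hypothetical sketch, not the paper's exact formulation.

```python
# Hypothetical sketch of inconsistency weighting for hard-to-adapt classes.
# Labels are flat lists of per-pixel class ids; the scheme is illustrative.

def inconsistency_weights(teacher_labels, student_labels, num_classes):
    """Per-class weight = 1 + teacher/student disagreement rate."""
    total = [0] * num_classes
    wrong = [0] * num_classes
    for t, s in zip(teacher_labels, student_labels):
        total[t] += 1
        if t != s:
            wrong[t] += 1
    return [1.0 + (wrong[c] / total[c] if total[c] else 0.0)
            for c in range(num_classes)]

# Class 0 is well adapted (no disagreement); class 1 is not.
w = inconsistency_weights([0, 0, 1, 1], [0, 0, 1, 0], num_classes=2)
print(w)  # → [1.0, 1.5]: the poorly adapted class gets the larger weight
```

Multiplying each pixel's loss by the weight of its pseudo-label class would then focus training on the classes that adapt worst, matching the stated intent of the inconsistency loss.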