Heterogeneous-Modal Unsupervised Domain Adaptation via Latent Space Bridging

📅 2025-06-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing unsupervised domain adaptation (UDA) methods fail when the source and target domains come from entirely different modalities (e.g., RGB and LiDAR). To address this, we propose a new setting: Heterogeneous-Modal Unsupervised Domain Adaptation (HMUDA). To enable cross-modal knowledge transfer, we design Latent Space Bridging (LSB), a dual-branch framework: one branch learns modality-invariant features, while the other jointly optimizes cross-modal feature consistency and cross-domain class-centroid alignment via a bridge domain. We further incorporate unsupervised semantic segmentation modeling to avoid pseudo-label noise. Our method achieves state-of-the-art performance on six heterogeneous-modality benchmarks. To the best of our knowledge, this is the first work to systematically formulate and solve the HMUDA problem, addressing both its modeling challenges and its alignment difficulties in a unified framework.

📝 Abstract

Unsupervised domain adaptation (UDA) methods effectively bridge domain gaps but struggle when the source and target domains belong to entirely distinct modalities. To address this limitation, we propose a novel setting called Heterogeneous-Modal Unsupervised Domain Adaptation (HMUDA), which enables knowledge transfer between completely different modalities by leveraging a bridge domain containing unlabeled samples from both modalities. To learn under the HMUDA setting, we propose Latent Space Bridging (LSB), a specialized framework designed for the semantic segmentation task. Specifically, LSB utilizes a dual-branch architecture, incorporating a feature consistency loss to align representations across modalities and a domain alignment loss to reduce discrepancies between class centroids across domains. Extensive experiments conducted on six benchmark datasets demonstrate that LSB achieves state-of-the-art performance.
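
As a rough illustration of what a feature consistency loss on the bridge domain could look like, the sketch below computes a mean squared distance between features produced by two modality branches. It is not the paper's formulation: the function name `feature_consistency_loss`, the choice of squared distance, and the assumption that bridge samples arrive as aligned pairs (one feature vector per modality for the same scene) are all hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def feature_consistency_loss(
    feats_a: Sequence[Vector], feats_b: Sequence[Vector]
) -> float:
    """Mean squared distance between paired feature vectors.

    Assumption (not from the paper): feats_a[i] and feats_b[i] are the
    features of the same bridge-domain sample seen through two modalities.
    """
    if len(feats_a) != len(feats_b) or not feats_a:
        raise ValueError("expected two non-empty batches of equal size")

    total = 0.0
    for fa, fb in zip(feats_a, feats_b):
        if len(fa) != len(fb):
            raise ValueError("paired feature vectors must share a dimension")
        # Average the squared difference over the feature dimension.
        total += sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa)

    # Average over the batch.
    return total / len(feats_a)
```

Under these assumptions the loss is zero when both branches map a bridge sample to the same latent vector and grows as the two representations drift apart.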

Problem

Research questions and friction points this paper is trying to address.

Transfer knowledge between completely different modalities

Align representations across heterogeneous modalities

Reduce domain gaps for semantic segmentation tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-branch architecture for cross-modal learning

Feature consistency loss for representation alignment

Domain alignment loss for centroid discrepancy reduction
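
The centroid-based domain alignment idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' loss: the helper names `class_centroids` and `centroid_alignment_loss` are hypothetical, class assignments for each domain are taken as given, and the discrepancy is measured as a squared Euclidean distance averaged over the classes present in both domains.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def class_centroids(
    features: Sequence[Sequence[float]], labels: Sequence[int]
) -> Dict[int, List[float]]:
    """Average the feature vectors that share a class assignment."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = defaultdict(int)
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        for i, value in enumerate(feat):
            sums[label][i] += value
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def centroid_alignment_loss(
    cent_a: Dict[int, List[float]], cent_b: Dict[int, List[float]]
) -> float:
    """Mean squared Euclidean distance between same-class centroids.

    Assumption (not from the paper): only classes present in both
    domains contribute; with no shared class the loss is 0.0.
    """
    shared = sorted(set(cent_a) & set(cent_b))
    if not shared:
        return 0.0
    total = 0.0
    for c in shared:
        total += sum((a - b) ** 2 for a, b in zip(cent_a[c], cent_b[c]))
    return total / len(shared)
```

In a segmentation setting the feature vectors would typically be per-pixel or per-point embeddings, so centroids would be accumulated over many locations per class; how class assignments are obtained for the unlabeled target domain is left open here.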
