Heterogeneous-Modal Unsupervised Domain Adaptation via Latent Space Bridging

📅 2025-06-19
🤖 AI Summary
Existing unsupervised domain adaptation (UDA) methods struggle when the source and target domains come from entirely different modalities (e.g., RGB and LiDAR). To address this, we propose a new setting: Heterogeneous-Modal Unsupervised Domain Adaptation (HMUDA). To enable cross-modal knowledge transfer, we design Latent Space Bridging (LSB), a dual-branch framework: one branch learns modality-invariant features, while the other jointly optimizes cross-modal feature consistency and cross-domain class-centroid alignment via a bridge domain. We further incorporate unsupervised semantic segmentation modeling to avoid pseudo-label noise. Our method achieves state-of-the-art performance on six heterogeneous-modal benchmarks, improving cross-modal transfer accuracy. To the best of our knowledge, this is the first work to systematically formulate and solve the HMUDA problem, addressing both its modeling challenges and its alignment difficulties in a unified framework.

📝 Abstract
Unsupervised domain adaptation (UDA) methods effectively bridge domain gaps but struggle when the source and target domains belong to entirely distinct modalities. To address this limitation, we propose a novel setting called Heterogeneous-Modal Unsupervised Domain Adaptation (HMUDA), which enables knowledge transfer between completely different modalities by leveraging a bridge domain containing unlabeled samples from both modalities. To learn under the HMUDA setting, we propose Latent Space Bridging (LSB), a specialized framework designed for the semantic segmentation task. Specifically, LSB uses a dual-branch architecture that incorporates a feature consistency loss to align representations across modalities and a domain alignment loss to reduce discrepancies between class centroids across domains. Extensive experiments on six benchmark datasets demonstrate that LSB achieves state-of-the-art performance.
Problem

Research questions and friction points this paper is trying to address.

Transfer knowledge between completely different modalities
Align representations across heterogeneous modalities
Reduce domain gaps for semantic segmentation tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-branch architecture for cross-modal learning
Feature consistency loss for representation alignment
Domain alignment loss for centroid discrepancy reduction
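
The two losses named above can be illustrated with a minimal NumPy sketch. It is an assumption-laden illustration, not the authors' implementation: the function names, the squared-Euclidean distance, the pairing of bridge-domain samples across modalities, and the use of per-sample class assignments to form centroids are all hypothetical choices made here for clarity.

```python
import numpy as np


def feature_consistency_loss(feat_a, feat_b):
    """Mean squared distance between paired features of two modalities.

    feat_a, feat_b: arrays of shape (N, D). Row i of each is assumed to come
    from the same bridge-domain sample (this pairing is an assumption).
    """
    feat_a = np.asarray(feat_a, dtype=float)
    feat_b = np.asarray(feat_b, dtype=float)
    return float(np.mean(np.sum((feat_a - feat_b) ** 2, axis=1)))


def class_centroids(features, assignments, num_classes):
    """Per-class mean feature; rows for classes with no samples are NaN.

    assignments: integer class index per sample (ground-truth labels or
    predicted classes, depending on the domain; hypothetical here).
    """
    features = np.asarray(features, dtype=float)
    assignments = np.asarray(assignments)
    centroids = np.full((num_classes, features.shape[1]), np.nan)
    for c in range(num_classes):
        mask = assignments == c
        if mask.any():
            centroids[c] = features[mask].mean(axis=0)
    return centroids


def centroid_alignment_loss(cent_x, cent_y):
    """Mean squared distance between centroids of classes present in both domains."""
    present = ~np.isnan(cent_x).any(axis=1) & ~np.isnan(cent_y).any(axis=1)
    if not present.any():
        return 0.0
    diff = cent_x[present] - cent_y[present]
    return float(np.mean(np.sum(diff ** 2, axis=1)))
```

In a training loop, one plausible design would add a weighted sum of these two terms to the segmentation objective; the weights and the exact combination are not specified in this summary and would need to be taken from the paper.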
Authors

Jiawen Yang
Southern University of Science and Technology
Shuhao Chen
HKUST, SUSTech
Transfer Learning, Large Language Model
Yucong Duan
SZ DJI Technology Co., Ltd
Ke Tang
Southern University of Science and Technology
Yu Zhang
Southern University of Science and Technology