SMART: Semantic Matching Contrastive Learning for Partially View-Aligned Clustering

📅 2025-12-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
In partially view-aligned clustering (PVC), semantic matching of unaligned samples remains challenging, and cross-view shifts in representation distributions lead to inaccurate correspondence estimation. To address these issues, this paper proposes a framework that integrates semantic matching with contrastive learning. It introduces what the summary describes as the first cross-view distribution correction mechanism, which jointly models shared semantic structures for both aligned and unaligned samples. It also incorporates self-supervised feature disentanglement and multi-view consistency regularization to mitigate representation shifts caused by view heterogeneity. Evaluated on eight benchmark datasets, the method achieves an average 4.2% improvement in clustering accuracy over state-of-the-art approaches and is reported to be more robust, especially under low alignment ratios. The core contributions are: (1) a semantic-matching-driven contrastive learning framework; (2) a cross-view distribution alignment mechanism; and (3) a unified semantic structure modeling paradigm tailored for partial alignment settings.

📝 Abstract
Multi-view clustering has been empirically shown to improve learning performance by leveraging the inherent complementary information across multiple views of data. However, in real-world scenarios, collecting strictly aligned views is challenging, and learning from both aligned and unaligned data becomes a more practical solution. Partially View-aligned Clustering (PVC) aims to learn correspondences between misaligned samples across views to better exploit the potential consistency and complementarity across views, including both aligned and unaligned data. However, most existing PVC methods fail to leverage unaligned data to capture the shared semantics among samples from the same cluster. Moreover, the inherent heterogeneity of multi-view data induces distributional shifts in representations, leading to inaccuracies in establishing meaningful correspondences between cross-view latent features and, consequently, impairing learning effectiveness. To address these challenges, we propose a Semantic MAtching contRasTive learning model (SMART) for PVC. The main idea of our approach is to alleviate the influence of cross-view distributional shifts, thereby facilitating semantic matching contrastive learning to fully exploit semantic relationships in both aligned and unaligned data. Extensive experiments on eight benchmark datasets demonstrate that our method consistently outperforms existing approaches on the PVC problem.
Problem

Research questions and friction points this paper is trying to address.

Addresses partially view-aligned clustering with misaligned data
Mitigates cross-view distributional shifts for semantic matching
Exploits semantic relationships in both aligned and unaligned data
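To make the problem setting above concrete, here is a minimal sketch of how a partially view-aligned dataset is commonly simulated from two fully paired views: a fraction of the rows keeps its correspondence, and the remaining rows of the second view are shuffled among themselves. The function name `make_partially_aligned`, its parameters, and the 50% default are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def make_partially_aligned(view1, view2, aligned_ratio=0.5, seed=0):
    """Hypothetical helper: break the row correspondence for part of view2.

    Returns view1 unchanged, a reordered copy of view2, a boolean mask that
    is True for rows still aligned, and the index array used for reordering.
    """
    rng = np.random.default_rng(seed)
    n = view1.shape[0]
    n_aligned = int(round(aligned_ratio * n))

    # Randomly choose which rows keep their cross-view correspondence.
    idx = rng.permutation(n)
    aligned_idx = np.sort(idx[:n_aligned])
    unaligned_idx = np.sort(idx[n_aligned:])

    # Shuffle only the unaligned rows of view2 among themselves.
    order = np.arange(n)
    order[unaligned_idx] = rng.permutation(unaligned_idx)

    mask = np.zeros(n, dtype=bool)
    mask[aligned_idx] = True
    return view1, view2[order], mask, order
```

A PVC method would see only `view1`, the reordered `view2`, and `mask`; the `order` array is the hidden ground truth that the method tries to recover for the unaligned rows.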
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic matching contrastive learning for clustering
Alleviates cross-view distributional shifts
Exploits semantic relationships in aligned and unaligned data
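The three ideas listed above can be illustrated with a small NumPy sketch. It is not the SMART model: per-view mean-centering stands in for distribution-shift correction, nearest-neighbour cosine matching stands in for semantic matching, and a standard InfoNCE loss stands in for the contrastive objective. All function names and the temperature value are assumptions made for illustration.

```python
import numpy as np

def center(z):
    """Subtract the per-view feature mean (a crude stand-in for shift correction)."""
    return z - z.mean(axis=0)

def l2_normalize(z, eps=1e-8):
    """Scale each row to unit length so dot products are cosine similarities."""
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)

def match_unaligned(z1, z2):
    """For each row of z1, return the index of the most cosine-similar row of z2."""
    sim = l2_normalize(z1) @ l2_normalize(z2).T
    return sim.argmax(axis=1)

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss where row i of z1 and row i of z2 form the positive pair."""
    logits = l2_normalize(z1) @ l2_normalize(z2).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Toy data: three clusters, two views, the second view offset by a constant.
rng = np.random.default_rng(0)
centers = 5.0 * np.eye(3, 8)
labels = np.repeat(np.arange(3), 20)
z1 = centers[labels] + rng.normal(scale=0.1, size=(60, 8))
z2 = centers[labels] + rng.normal(scale=0.1, size=(60, 8)) + 3.0

# Shuffle view 2, then recover cluster-level correspondences after centering.
perm = rng.permutation(60)
match = match_unaligned(center(z1), center(z2[perm]))
matched_labels = labels[perm][match]
print("label agreement of matched pairs:", np.mean(matched_labels == labels))
print("InfoNCE on aligned pairs:", info_nce(center(z1), center(z2)))
```

Nearest-neighbour matching only recovers a sample from the same cluster, not the exact counterpart, which is why the sketch scores matches by label agreement; this mirrors the point above that unaligned data can still carry shared cluster-level semantics.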
Liang Peng
Department of Computer Science, Shantou University
Yixuan Ye
Data Scientist - Research, Google LLC
Statistical Modeling, Genetic Prediction
Cheng Liu
College of Computer Science and Technology, Huaqiao University and the Department of Computer Science, Shantou University
Hangjun Che
Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China
Fei Wang
Department of Computer Science, Shantou University
Zhiwen Yu
School of Computer Science and Engineering, South China University of Technology
Si Wu
School of Computer Science and Engineering, South China University of Technology
Hau-San Wong
Professor, Department of Computer Science, City University of Hong Kong
Artificial Intelligence, Machine Learning, Data Mining, Bioinformatics