🤖 AI Summary
Problem: In incomplete multi-view clustering (IMVC), existing imputation-based methods face two key bottlenecks under high missing rates (≥80%): heavy reliance on paired data to train the recovery module, and generated views that lack diversity and discriminability.
Method: We propose Diffusion Contrastive Generation (DCG), an end-to-end joint framework that couples diffusion models with cross-view contrastive learning. Forward noise perturbation and reverse denoising are applied to intra-view data, and contrastive learning on only a limited set of paired samples aligns generated views with real ones, enabling view reconstruction under arbitrary missing patterns. We further introduce dual-granularity (instance-level and class-level) interactive representation learning to jointly optimize generative fidelity and clustering performance. Together, the diffusion process, multi-view consistency constraints, and unified optimization enhance both the diversity and the separability of generated views.
Contribution/Results: Our method demonstrates robust clustering performance under extreme missingness (≥80%) across multiple benchmark datasets, consistently surpassing state-of-the-art approaches in clustering accuracy.
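As a rough illustration of the "forward noise perturbation" step named in the method summary, the sketch below uses the standard closed-form forward process from denoising diffusion models, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The linear noise schedule, the step count, and the function names `linear_alpha_bar` and `forward_diffuse` are assumptions made for this example; the summary does not state which schedule or parameterization DCG uses.

```python
import math
import random

def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear schedule."""
    alpha_bar, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in closed form; also return the noise."""
    signal = math.sqrt(alpha_bar[t])        # how much of x_0 survives
    sigma = math.sqrt(1.0 - alpha_bar[t])   # how much Gaussian noise is mixed in
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + sigma * e for x, e in zip(x0, eps)]
    return x_t, eps

# Hypothetical usage: perturb one intra-view feature vector at step t = 500.
rng = random.Random(0)
alpha_bar = linear_alpha_bar()
x_t, eps = forward_diffuse([0.5, -1.0, 2.0], 500, alpha_bar, rng)
```

A denoising network would then be trained to predict `eps` from `x_t` and `t`; that reverse step is omitted here because the summary gives no architectural details.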
📝 Abstract
Incomplete multi-view clustering (IMVC) has garnered increasing attention in recent years due to the common issue of missing data in multi-view datasets. The primary approach to address this challenge involves recovering the missing views before applying conventional multi-view clustering methods. Although imputation-based IMVC methods have achieved significant improvements, they still encounter notable limitations: 1) heavy reliance on paired data for training the data recovery module, which is impractical in real scenarios with high missing data rates; 2) the generated data often lacks diversity and discriminability, resulting in suboptimal clustering results. To address these shortcomings, we propose a novel IMVC method called Diffusion Contrastive Generation (DCG). Motivated by the consistency between the diffusion and clustering processes, DCG learns the distribution characteristics to enhance clustering by applying forward diffusion and reverse denoising processes to intra-view data. By performing contrastive learning on a limited set of paired multi-view samples, DCG can align the generated views with the real views, facilitating accurate recovery of views across arbitrary missing view scenarios. Additionally, DCG integrates instance-level and category-level interactive learning to exploit the consistent and complementary information available in multi-view data, achieving robust and end-to-end clustering. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches.
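To make the alignment idea concrete, here is a minimal sketch of a cross-view contrastive objective on paired samples, written as a generic InfoNCE loss over cosine similarities. This is a stand-in under stated assumptions: the abstract does not give DCG's exact loss, and the names `cosine` and `info_nce` as well as the temperature value are hypothetical choices for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def info_nce(generated, real, temperature=0.5):
    """Mean InfoNCE loss; row i of `generated` is the positive for row i of `real`."""
    n = len(generated)
    total = 0.0
    for i in range(n):
        logits = [cosine(generated[i], real[j]) / temperature for j in range(n)]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]  # negative log-softmax of the positive pair
    return total / n

# Hypothetical paired features: generated views vs. the real views they should match.
real_views = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
loss = info_nce(aligned, real_views)
```

The loss falls as each generated view becomes more similar to its own real counterpart than to the real views of other instances, which is the sense in which a small paired subset can steer generation for unpaired samples.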