🤖 AI Summary
In real-world multi-view clustering (MVC), views are often only partially sample-aligned. This violates the full-alignment assumption underlying conventional methods and degrades their performance.
Method: This paper proposes CauMVC, a causality-driven generalized MVC framework that models the shift in sample order across views (from fully to partially aligned) as a causal intervention. A variational autoencoder provides the causal learning machinery: its encoder estimates cross-view invariant features, its decoder performs post-intervention inference, and a contrastive regularizer captures sample correlations.
Contribution/Results: CauMVC does not require strictly aligned views. Experiments on both fully and partially aligned data show strong generalization and effectiveness, and the authors describe it as the first work to address generalized MVC via causal learning.
📝 Abstract
Multi-view clustering (MVC) aims to explore the common clustering structure across multiple views. Many existing MVC methods rely heavily on the assumption of view consistency, in which corresponding samples across different views are aligned in advance. In real-world scenarios, however, often only part of the data is consistently aligned across views, which limits overall clustering performance. In this work, we treat the performance degradation caused by data order shift (i.e., from fully to partially aligned data) as a generalized multi-view clustering problem. To tackle this problem, we design a causal multi-view clustering network, termed CauMVC. We adopt a causal modeling approach to understand the multi-view clustering procedure. Specifically, we formulate the partially aligned data as an intervention, and multi-view clustering with partially aligned data as a post-intervention inference. However, obtaining invariant features directly can be challenging. We therefore design a Variational Auto-Encoder for causal learning, whose encoder estimates the invariant features from the existing information and whose decoder performs the post-intervention inference. Lastly, we design a contrastive regularizer to capture sample correlations. To the best of our knowledge, this paper is the first work to address generalized multi-view clustering via causal learning. Empirical experiments on both fully and partially aligned data demonstrate the strong generalization and effectiveness of CauMVC.
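To make the "data order shift" concrete, the sketch below simulates moving from fully to partially aligned data on two toy views. It is an illustration only, not the CauMVC pipeline: the function name `partially_align`, the `aligned_ratio` parameter, and the choice to keep the first rows aligned and permute the rest are all assumptions introduced here.

```python
import random


def partially_align(view_a, view_b, aligned_ratio, seed=0):
    """Simulate a data order shift between two views (hypothetical helper).

    The first round(aligned_ratio * n) samples of view_b stay in the same
    order as view_a; the remaining samples of view_b are randomly permuted,
    so their correspondence with view_a is no longer guaranteed.
    Returns the reordered view_b and the index order that was applied.
    """
    if len(view_a) != len(view_b):
        raise ValueError("both views must contain the same number of samples")
    if not 0.0 <= aligned_ratio <= 1.0:
        raise ValueError("aligned_ratio must lie in [0, 1]")

    n = len(view_a)
    n_aligned = round(aligned_ratio * n)

    # Indices of the samples whose cross-view correspondence is lost.
    tail = list(range(n_aligned, n))
    random.Random(seed).shuffle(tail)  # a shuffle may leave some rows in place

    order = list(range(n_aligned)) + tail
    shifted_b = [view_b[i] for i in order]
    return shifted_b, order


if __name__ == "__main__":
    # Two toy views of the same 8 samples (each sample is a small feature list).
    view_a = [[float(i), float(i) + 0.5] for i in range(8)]
    view_b = [[10.0 * i] for i in range(8)]

    shifted_b, order = partially_align(view_a, view_b, aligned_ratio=0.5, seed=1)
    print("applied order:", order)
    print("view_b after shift:", shifted_b)
```

With `aligned_ratio=1.0` the function returns `view_b` unchanged, which corresponds to the fully aligned setting that conventional MVC methods assume. Lower ratios give the partially aligned setting that the abstract formulates as an intervention.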