AI Summary
To address severe and imbalanced missingness across views in incomplete multi-view clustering, this paper proposes an information-driven selective imputation framework. Unlike conventional approaches that either blindly impute all missing entries or discard incomplete views entirely, our method quantifies the informativeness of each missing position by jointly leveraging intra-view similarity and inter-view consistency. Imputation is performed only where sufficient information exists, and at the distribution level, with explicit modeling of imputation uncertainty. Furthermore, we employ a variational autoencoder equipped with a Gaussian mixture prior to learn clustering-friendly latent representations and achieve uncertainty-aware latent-space fusion. The framework is lightweight, model-agnostic, and plug-and-play. Extensive experiments on multiple real-world benchmarks featuring imbalanced missingness demonstrate that our method significantly outperforms both state-of-the-art imputation-based and imputation-free methods, improving clustering accuracy and robustness simultaneously.
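The selective-imputation idea above can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration, not the paper's exact formulation: cosine similarity stands in for intra-view similarity, using any view where the sample is observed as a cross-view anchor stands in for inter-view consistency, and a simple threshold `tau` on the mean top-k similarity plays the role of the informativeness criterion. Imputation is distribution-level: each imputed entry gets a mean and a variance (neighbor spread) rather than a single point estimate.

```python
import numpy as np

def cosine_sim(a, B):
    """Cosine similarity between one row vector a (d,) and rows of B (m, d)."""
    a = a / (np.linalg.norm(a) + 1e-12)
    Bn = B / (np.linalg.norm(B, axis=1, keepdims=True) + 1e-12)
    return Bn @ a

def selective_impute(X, masks, tau=0.3, k=3):
    """Illustrative selective, distribution-level imputation.
    X:     list of (n, d_v) arrays, one per view (missing rows arbitrary).
    masks: list of (n,) boolean arrays, True where the sample is observed.
    Returns per-view means/variances and a boolean matrix of imputed entries."""
    V, n = len(X), X[0].shape[0]
    mu = [x.astype(float).copy() for x in X]
    var = [np.zeros_like(m) for m in mu]
    imputed = np.zeros((n, V), dtype=bool)
    for v in range(V):
        for i in np.where(~masks[v])[0]:
            # cross-view anchor: a view where sample i IS observed
            refs = [u for u in range(V) if u != v and masks[u][i]]
            if not refs:
                continue                      # no anchor -> leave missing
            u = refs[0]
            # candidate neighbors: observed in both the anchor and target view
            cand = np.where(masks[u] & masks[v])[0]
            if cand.size < k:
                continue                      # too little intra-view support
            s = cosine_sim(X[u][i], X[u][cand])
            order = np.argsort(-s)[:k]
            w = np.clip(s[order], 0.0, None)
            if w.mean() < tau:                # informativeness score too low
                continue                      # -> selectively skip imputation
            w = w / (w.sum() + 1e-12)
            nb = X[v][cand[order]].astype(float)
            m = w @ nb                        # neighbor-weighted mean
            sp = w @ (nb - m) ** 2            # spread = imputation uncertainty
            mu[v][i], var[v][i] = m, sp + 1e-6
            imputed[i, v] = True
    return mu, var, imputed
```

Observed entries pass through unchanged; missing entries with no cross-view anchor or too little neighbor support are left missing, which is the "selective" part of the mechanism.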
Abstract
Incomplete multi-view data, where different views suffer from missing and imbalanced observations, pose significant challenges for clustering. Existing imputation-based methods attempt to estimate missing views to restore data associations, but indiscriminate imputation often introduces noise and bias, especially when the available information is insufficient. Imputation-free methods avoid this risk by relying solely on observed data, but struggle under severe incompleteness due to the lack of cross-view complementarity. To address these issues, we propose Informativeness-based Selective imputation Multi-View Clustering (ISMVC). Our method evaluates the imputation-relevant informativeness of each missing position based on intra-view similarity and cross-view consistency, and selectively imputes only when sufficient support is available. Furthermore, we integrate this selection with a variational autoencoder equipped with a mixture-of-Gaussians prior to learn clustering-friendly latent representations. By performing distribution-level imputation, ISMVC not only stabilizes the aggregation of posterior distributions but also explicitly models imputation uncertainty, enabling robust fusion and preventing overconfident reconstructions. Compared with existing cautious imputation strategies that depend on training dynamics or model feedback, our method is lightweight, data-driven, and model-agnostic, and can be readily integrated into existing incomplete multi-view clustering (IMC) models as a plug-in module. Extensive experiments on multiple benchmark datasets under a more realistic and challenging imbalanced missing scenario demonstrate that our method outperforms both imputation-based and imputation-free approaches.
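The uncertainty-aware fusion mentioned above can be sketched with a standard precision-weighted (product-of-experts style) combination of diagonal-Gaussian posteriors. This is a generic illustration of the principle, assuming each view contributes a mean/variance pair, not the paper's exact fusion rule: a view whose representation carries high variance (e.g. an uncertain imputation) is automatically down-weighted in the fused latent distribution.

```python
import numpy as np

def fuse_gaussians(mus, variances):
    """Precision-weighted fusion of per-view diagonal-Gaussian posteriors.
    mus, variances: lists of (d,) arrays (one pair per view).
    Views with larger variance (more uncertain, e.g. imputed) get less weight."""
    precisions = [1.0 / v for v in variances]
    total_prec = np.sum(precisions, axis=0)
    fused_var = 1.0 / total_prec
    fused_mu = fused_var * np.sum(
        [p * m for p, m in zip(precisions, mus)], axis=0
    )
    return fused_mu, fused_var
```

With two equally confident views the fused mean is their average; if one view's variance is inflated (an uncertain imputation), the fused mean collapses toward the confident view, which is the behavior that prevents overconfident reconstructions from dominating the clustering representation.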