Scalable Robust Bayesian Co-Clustering with Compositional ELBOs

📅 2025-04-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address noise sensitivity, poor robustness to missing values, and posterior collapse in deep co-clustering of high-dimensional sparse data, this paper proposes the first variational co-clustering framework that jointly learns cluster structures for both instances and features in the latent space. Methodologically: (1) a doubly reparameterized Compositional ELBO is designed; (2) a scale-modulation mechanism is introduced to mitigate posterior collapse; (3) a mutual-information cross-loss is formulated to enforce consistency between row and column clustering; and (4) a reconstruction architecture integrating variational deep embedding, dual Gaussian mixture priors, and KL-divergence constraints enables end-to-end noise modeling. Evaluated on multimodal real-world datasets, the method achieves significant improvements in clustering accuracy and robustness, particularly under high dimensionality, noise corruption, or missing data, and outperforms state-of-the-art co-clustering approaches.
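The scale-modulation mechanism in (2) can be sketched in a few lines of PyTorch. This is a minimal illustration under assumptions: a Gaussian encoder returning (mu, logvar), a standard-normal KL standing in for the paper's dual GMM priors, and a hypothetical scale factor `gamma`; it is not the authors' implementation.

```python
import torch

def scale_modulated_step(x, encoder, decoder, gamma=2.0):
    # Assumed interface: encoder returns Gaussian posterior parameters.
    mu, logvar = encoder(x)
    std = torch.exp(0.5 * logvar)
    # Amplify the latent mean ONLY on the reconstruction path.
    z_rec = gamma * mu + std * torch.randn_like(std)
    recon = decoder(z_rec)
    # The KL term still sees the unscaled (mu, logvar), so it is not
    # inflated. Standard-normal prior shown for brevity; the paper
    # uses GMM priors for rows and columns.
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    return recon, kl
```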

📝 Abstract
Co-clustering exploits the duality of instances and features to simultaneously uncover meaningful groups in both dimensions, often outperforming traditional clustering in high-dimensional or sparse data settings. Although recent deep learning approaches successfully integrate feature learning and cluster assignment, they remain susceptible to noise and can suffer from posterior collapse within standard autoencoders. In this paper, we present the first fully variational co-clustering framework that directly learns row and column clusters in the latent space, leveraging a doubly reparameterized ELBO to improve the signal-to-noise ratio of gradient estimates. Our unsupervised model integrates a Variational Deep Embedding with a Gaussian Mixture Model (GMM) prior for both instances and features, providing a built-in clustering mechanism that naturally aligns latent modes with row and column clusters. Furthermore, our regularized end-to-end noise-learning Compositional ELBO architecture jointly reconstructs the data while regularizing against noise through the KL divergence, thus gracefully handling corrupted or missing inputs in a single training pipeline. To counteract posterior collapse, we introduce a scale modification that increases the encoder's latent means only in the reconstruction pathway, preserving richer latent representations without inflating the KL term. Finally, a mutual-information-based cross-loss ensures coherent co-clustering of rows and columns. Empirical results on diverse real-world datasets spanning numerical, textual, and image modalities demonstrate that our method not only preserves the advantages of prior co-clustering approaches but also exceeds them in accuracy and robustness, particularly in high-dimensional or noisy settings.
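As a concrete reading of how the pieces described in the abstract fit together, the sketch below composes a reconstruction term, the two KL terms, and one plausible mutual-information cross-loss between soft row and column assignments. The loss weight `lam` and this particular MI estimator (co-occurrences weighted by a nonnegative data matrix) are assumptions, not the paper's exact formulation.

```python
import torch

def coclustering_objective(recon_nll, kl_row, kl_col, q_row, q_col, x, lam=1.0):
    """Compositional objective sketch: reconstruction + row/column KL
    terms, minus a mutual-information cross-loss tying the two views.

    q_row: (n, K_r) soft row-cluster memberships
    q_col: (m, K_c) soft column-cluster memberships
    x:     (n, m) nonnegative data matrix weighting co-occurrences
    """
    joint = q_row.T @ x @ q_col              # (K_r, K_c) co-occurrence mass
    joint = joint / joint.sum()
    p_r = joint.sum(dim=1, keepdim=True)     # row-cluster marginal
    p_c = joint.sum(dim=0, keepdim=True)     # column-cluster marginal
    eps = 1e-9
    mi = (joint * (torch.log(joint + eps) - torch.log(p_r @ p_c + eps))).sum()
    # Maximizing MI encourages coherent row/column co-clustering.
    return recon_nll + kl_row + kl_col - lam * mi
```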
Problem

Research questions and friction points this paper is trying to address.

Develops robust co-clustering for high-dimensional sparse data
Addresses noise and posterior collapse in deep clustering
Improves gradient signal-to-noise separation with a doubly reparameterized compositional ELBO
Innovation

Methods, ideas, or system contributions that make the work stand out.

Variational co-clustering with a doubly reparameterized ELBO (see the sketch after this list)
Dual GMM priors for clustering both instances and features
Regularized end-to-end noise learning via the Compositional ELBO
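The first item builds on the doubly reparameterized gradient (DReG) estimator of Tucker et al. (2019), which lowers the variance of inference-network gradients for importance-weighted bounds. The sketch below is the generic single-view form under assumed interfaces (`encoder`, `decode_log_prob`, `prior`), not the paper's row/column composition.

```python
import torch
from torch.distributions import Normal

def dreg_bound(x, encoder, decode_log_prob, prior, K=10):
    # Assumed interface: encoder returns Gaussian posterior parameters.
    mu, logvar = encoder(x)
    std = torch.exp(0.5 * logvar)
    eps = torch.randn(K, *mu.shape, device=x.device)
    z = mu + eps * std                       # (K, B, D) reparameterized samples
    # Stop gradients through the posterior *parameters* inside log q,
    # while keeping the pathwise gradient through z itself.
    log_q = Normal(mu.detach(), std.detach()).log_prob(z).sum(-1)
    log_w = prior.log_prob(z).sum(-1) + decode_log_prob(x, z) - log_q
    # Squared normalized importance weights give the low-variance
    # doubly reparameterized gradient for the encoder.
    w_sq = torch.softmax(log_w, dim=0).detach() ** 2
    return (w_sq * log_w).sum(0).mean()      # maximize (negate for a loss)
```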