🤖 AI Summary
To address the challenge of jointly optimizing clustering performance and generative capability in deep clustering, this paper proposes VAE-EM, a deep generative clustering framework that integrates a variational autoencoder (VAE) into the expectation-maximization (EM) paradigm. Unlike conventional VAE-based approaches that rely on Gaussian mixture model (GMM) priors or auxiliary regularization terms, VAE-EM models the probability distribution of each cluster with a VAE. It alternates between two steps: updating model parameters by maximizing the evidence lower bound (ELBO) of the log-likelihood, and refining cluster assignments based on the learned distributions. On MNIST and FashionMNIST, the authors report clustering performance superior to state-of-the-art methods. The framework also generates new, semantically consistent samples from each cluster, so clustering and explicit generative modeling are handled within a single optimization procedure.
📝 Abstract
We propose a novel deep clustering method that integrates Variational Autoencoders (VAEs) into the Expectation-Maximization (EM) framework. Our approach models the probability distribution of each cluster with a VAE and alternates between updating model parameters by maximizing the Evidence Lower Bound (ELBO) of the log-likelihood and refining cluster assignments based on the learned distributions. This enables effective clustering and generation of new samples from each cluster. Unlike existing VAE-based methods, our approach eliminates the need for a Gaussian Mixture Model (GMM) prior or additional regularization techniques. Experiments on MNIST and FashionMNIST demonstrate superior clustering performance compared to state-of-the-art methods.
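The alternation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: assignments are taken to be hard (each sample goes to the cluster whose model scores it highest), there is one model per cluster, and a 1-D Gaussian stands in for each VAE so that the example runs with the Python standard library alone. The Gaussian's exact log-likelihood plays the role of the ELBO. The names `GaussianClusterModel`, `hard_em`, and the toy data are hypothetical and introduced only for this sketch.

```python
import math
import random


class GaussianClusterModel:
    """Stand-in for a per-cluster VAE (assumption): a 1-D Gaussian.

    fit()    plays the role of maximizing the ELBO on the cluster's samples.
    score()  plays the role of the per-sample ELBO (here: exact log-likelihood).
    sample() plays the role of decoding a latent draw into a new sample.
    """

    def __init__(self, mean):
        self.mean = mean
        self.var = 1.0

    def fit(self, xs):
        if not xs:  # empty cluster: keep the previous parameters
            return
        self.mean = sum(xs) / len(xs)
        # Floor the variance so the log-density stays finite.
        self.var = max(sum((x - self.mean) ** 2 for x in xs) / len(xs), 1e-6)

    def score(self, x):
        return -0.5 * (math.log(2 * math.pi * self.var)
                       + (x - self.mean) ** 2 / self.var)

    def sample(self, rng):
        return rng.gauss(self.mean, math.sqrt(self.var))


def hard_em(xs, models, n_iters=20):
    """Alternate between reassigning samples and refitting each cluster model."""
    assign = None
    for _ in range(n_iters):
        # Assignment step: pick the cluster whose model scores the sample highest.
        new_assign = [
            max(range(len(models)), key=lambda k: models[k].score(x))
            for x in xs
        ]
        if new_assign == assign:  # assignments stopped changing
            break
        assign = new_assign
        # Update step: refit each model on the samples currently assigned to it.
        for k, model in enumerate(models):
            model.fit([x for x, a in zip(xs, assign) if a == k])
    return assign


# Toy data: two well-separated 1-D blobs (hypothetical, for illustration only).
rng = random.Random(0)
xs = [rng.gauss(-5.0, 1.0) for _ in range(50)] + \
     [rng.gauss(5.0, 1.0) for _ in range(50)]

models = [GaussianClusterModel(mean=-1.0), GaussianClusterModel(mean=1.0)]
assign = hard_em(xs, models)

# Draw one new sample from each fitted cluster model.
new_samples = [m.sample(rng) for m in models]
```

With actual VAEs in place of the Gaussians, `fit` would presumably run gradient steps on the ELBO over the samples currently assigned to that cluster, and `score` would return a per-sample ELBO estimate. Whether the paper uses hard or soft assignments, and how it initializes the clusters, is not stated in the abstract.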