A Bayesian Non-parametric Approach to Generative Models: Integrating Variational Autoencoder and Generative Adversarial Networks using Wasserstein and Maximum Mean Discrepancy

📅 2023-08-27

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

career value

225K/year

🤖 AI Summary

To address the inherent limitations of GANs—mode collapse—and VAEs—generating blurry or noisy samples—this paper proposes a Bayesian nonparametric (BNP) generative framework. Methodologically, it innovatively unifies VAE and GAN architectures via a dual-discriminator design that jointly optimizes the Wasserstein distance and maximum mean discrepancy (MMD), while introducing an auxiliary generator in the latent space to model infinite-dimensional latent distributions. Theoretically, this architecture mitigates overfitting and mode collapse; empirically, it enhances image fidelity, sample diversity, and support coverage of the target data distribution. Extensive experiments demonstrate that the proposed method significantly outperforms state-of-the-art baselines across image generation, anomaly detection, and data augmentation tasks—particularly under low-data regimes and long-tailed class distributions—where it exhibits superior robustness and generalization.

📝 Abstract

Generative models have emerged as a promising technique for producing high-quality images that are indistinguishable from real images. Generative adversarial networks (GANs) and variational autoencoders (VAEs) are two of the most prominent and widely studied generative models. GANs have demonstrated excellent performance in generating sharp realistic images and VAEs have shown strong abilities to generate diverse images. However, GANs suffer from ignoring a large portion of the possible output space which does not represent the full diversity of the target distribution, and VAEs tend to produce blurry images. To fully capitalize on the strengths of both models while mitigating their weaknesses, we employ a Bayesian non-parametric (BNP) approach to merge GANs and VAEs. Our procedure incorporates both Wasserstein and maximum mean discrepancy (MMD) measures in the loss function to enable effective learning of the latent space and generate diverse and high-quality samples. By fusing the discriminative power of GANs with the reconstruction capabilities of VAEs, our novel model achieves superior performance in various generative tasks, such as anomaly detection and data augmentation. Furthermore, we enhance the model's capability by employing an extra generator in the code space, which enables us to explore areas of the code space that the VAE might have overlooked. With a BNP perspective, we can model the data distribution using an infinite-dimensional space, which provides greater flexibility in the model and reduces the risk of overfitting. By utilizing this framework, we can enhance the performance of both GANs and VAEs to create a more robust generative model suitable for various applications.

Problem

Research questions and friction points this paper is trying to address.

Addresses overfitting in GANs and noisy samples in VAEs

Enhances training stability with Wasserstein and MMD measures

Combines GAN, VAE, and CGAN for diverse high-quality samples

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian non-parametric framework enhances stability

Wasserstein and MMD loss for robust training

Triple model combines GAN, VAE, and CGAN

🔎 Similar Papers

A Markov Random Field Multi-Modal Variational AutoEncoder