Max-Sliced Wasserstein Distance and Its Use for GANs

πŸ“… 2019-04-11
πŸ›οΈ Computer Vision and Pattern Recognition
πŸ“ˆ Citations: 212
✨ Influential: 34
πŸ“„ PDF
πŸ€– AI Summary
In GAN training, the sliced Wasserstein distance (SWD) relies on numerous random projections, leading to high computational cost and slow convergence for high-resolution images (e.g., 256Γ—256). To address this, we propose the max-sliced Wasserstein distance (Max-SWD), which replaces multiple random one-dimensional projections with a single optimal projection direction. Theoretically, Max-SWD preserves SWD’s favorable sample complexity; computationally, the optimal direction is efficiently obtained via singular value decomposition (SVD) or gradient ascent. This work is the first to incorporate explicit directional optimization into the definition of a Wasserstein-type distance. Integrated into DCGAN/WGAN frameworks, Max-SWD substantially reduces projection-related computation. Experiments on CIFAR-10 and CelebA demonstrate improved training stability, faster convergence (fewer iterations), and consistently lower FrΓ©chet Inception Distance (FID) compared to standard SWD baselines.
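The core idea described above, replacing many random projections with one optimal direction found by gradient ascent, can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation: the function name, learning rate, and iteration count are placeholders, and it uses the sorting characterization of the 1D Wasserstein-1 distance between equally sized samples.

```python
import numpy as np

def max_sliced_w1(X, Y, iters=200, lr=0.1, seed=0):
    """Max-sliced 1-Wasserstein distance between two equally sized samples.

    Searches for a single unit direction theta that maximizes the 1D
    Wasserstein distance between the projected samples, using projected
    (sub)gradient ascent on the unit sphere. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(X.shape[1])
    theta /= np.linalg.norm(theta)
    for _ in range(iters):
        # In 1D, W1 between equal-size samples pairs sorted order statistics.
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        sx = X[np.argsort(X @ theta)]
        sy = Y[np.argsort(Y @ theta)]
        # Subgradient of mean |px_i - py_i| w.r.t. theta, holding the
        # sorting permutations fixed.
        grad = np.mean(np.sign(px - py)[:, None] * (sx - sy), axis=0)
        theta += lr * grad
        theta /= np.linalg.norm(theta)  # project back onto the unit sphere
    return np.mean(np.abs(np.sort(X @ theta) - np.sort(Y @ theta)))
```

For two samples differing by a shift of 3 along one axis, the ascent should recover (approximately) that axis as the optimal direction, giving a distance near 3, whereas a single random projection would typically underestimate it.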
πŸ“ Abstract
Generative adversarial nets (GANs) and variational auto-encoders have significantly improved our distribution modeling capabilities, showing promise for dataset augmentation, image-to-image translation and feature learning. However, to model high-dimensional distributions, sequential training and stacked architectures are common, increasing the number of tunable hyper-parameters as well as the training time. Nonetheless, the sample complexity of the distance metrics remains one of the factors affecting GAN training. We first show that the recently proposed sliced Wasserstein distance has compelling sample complexity properties when compared to the Wasserstein distance. To further improve the sliced Wasserstein distance we then analyze its 'projection complexity' and develop the max-sliced Wasserstein distance which enjoys compelling sample complexity while reducing projection complexity, albeit necessitating a max estimation. We finally illustrate that the proposed distance trains GANs on high-dimensional images up to a resolution of 256×256 easily.
Problem

Research questions and friction points this paper is trying to address.

Improving sample complexity of distance metrics for GAN training
Reducing projection complexity in sliced Wasserstein distance
Enabling stable GAN training on high-dimensional image datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed max-sliced Wasserstein distance metric
Reduced projection complexity for distribution modeling
Enabled GAN training on high-resolution 256x256 images
πŸ”Ž Similar Papers
No similar papers found.
👥 Authors
Ishan Deshpande, University of Illinois at Urbana-Champaign
Yuan-Ting Hu, Research Scientist, FAIR, Meta AI (computer vision, machine learning)
Ruoyu Sun, University of Illinois at Urbana-Champaign
Ayis Pyrros, Neuroradiology, DuPage Medical Group (radiology, machine learning)
Nasir Siddiqui, University of Illinois at Urbana-Champaign
Oluwasanmi Koyejo, University of Illinois at Urbana-Champaign
Zhizhen Zhao, University of Illinois at Urbana-Champaign
D. Forsyth, University of Illinois at Urbana-Champaign
A. Schwing, University of Illinois at Urbana-Champaign