🤖 AI Summary
Background: Existing WGAN theory is restricted to the linear-quadratic-Gaussian (LQG) setting, limiting its applicability to non-Gaussian data and nonlinear neural generators.
Method: We propose a novel sliced WGAN framework based on joint distribution constraints, incorporating random projections and nonlinear generator modeling; we rigorously establish the asymptotic optimality of linear generators in high dimensions.
Contribution/Results: First, we derive closed-form optimal solutions for both classical and sliced Wasserstein GANs in one dimension under non-Gaussian data and nonlinear activations, breaking the LQG barrier. Second, we characterize the analytical structure of optimal parameters for low-dimensional nonlinear generators. Third, we establish the first theoretical analysis paradigm for sliced Wasserstein distances applicable to non-Gaussian, nonlinear settings. Experiments confirm that the derived closed-form parameters yield stable convergence on both Gaussian and Laplace data, achieving performance comparable to r-PCA at lower computational cost.
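The one-dimensional setting is tractable because the squared 2-Wasserstein distance between two empirical distributions with equal sample counts reduces to the mean squared difference of their order statistics, so a closed-form optimal parameter follows from least squares on sorted samples. The sketch below illustrates only this standard linear 1D case, not the paper's nonlinear-activation results; the function and variable names are our own:

```python
import numpy as np

def wasserstein2_1d(x, y):
    # In 1D, the squared 2-Wasserstein distance between two empirical
    # distributions with the same number of samples is the mean squared
    # difference of their order statistics (sorted samples).
    xs, ys = np.sort(x), np.sort(y)
    return np.mean((xs - ys) ** 2)

rng = np.random.default_rng(0)
data = rng.laplace(size=4096)   # non-Gaussian target distribution
noise = rng.normal(size=4096)   # latent input to the generator

# For a linear "generator" g(z) = a*z, minimizing the 1D squared W2
# over a is a least-squares problem on the sorted samples:
xs, zs = np.sort(data), np.sort(noise)
a = np.dot(xs, zs) / np.dot(zs, zs)  # closed-form optimal scale
print(wasserstein2_1d(data, a * noise))
```

The closed-form scale `a` is guaranteed to do no worse than the unscaled generator, which is the kind of analytical tractability the 1D results exploit.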
📝 Abstract
The generative adversarial network (GAN) aims to approximate an unknown distribution via a parameterized neural network (NN). While GANs have been widely applied in reinforcement learning, semi-supervised learning, and computer vision tasks, selecting their parameters often requires an exhaustive search, and only a few selection methods can be proven theoretically optimal. One of the most promising GAN variants is the Wasserstein GAN (WGAN). Prior work on optimal parameters for WGANs is limited to the linear-quadratic-Gaussian (LQG) setting, where the NN is linear and the data is Gaussian. In this paper, we characterize optimal WGAN parameters beyond the LQG setting. We derive closed-form optimal parameters for one-dimensional WGANs when the NN has nonlinear activation functions and the data is non-Gaussian. To extend this result to high-dimensional WGANs, we adopt the sliced Wasserstein framework and replace the constraint on the marginal distributions of the randomly projected data with a constraint on the joint distribution of the original (unprojected) data. We show that the linear generator can be asymptotically optimal for sliced WGANs with non-Gaussian data. Empirical studies show that our closed-form WGAN parameters exhibit good convergence behavior on both Gaussian and Laplace data. Moreover, compared to the r-principal component analysis (r-PCA) solution, our proposed sliced WGAN solution achieves the same performance while requiring fewer computational resources.
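The sliced Wasserstein framework the abstract builds on is a standard construction: project the d-dimensional samples onto random unit directions and average the one-dimensional Wasserstein distances of the projected marginals, each of which has a closed form via sorting. A minimal Monte Carlo sketch (not the paper's code; names and parameters are our own) could look like:

```python
import numpy as np

def sliced_wasserstein2(x, y, n_proj=100, rng=None):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    Draws random directions on the unit sphere, projects both sample
    sets onto each direction, and averages the closed-form 1D squared
    W2 distances (mean squared difference of sorted projections).
    """
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    theta = rng.normal(size=(n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # unit directions
    px = np.sort(x @ theta.T, axis=0)  # each column: sorted 1D projection
    py = np.sort(y @ theta.T, axis=0)
    return np.mean((px - py) ** 2)   # average over samples and projections

rng = np.random.default_rng(1)
x = rng.normal(size=(2000, 8))   # Gaussian samples
y = rng.laplace(size=(2000, 8))  # non-Gaussian samples
print(sliced_wasserstein2(x, y, rng=2))
```

Because each projected problem is one-dimensional, the per-direction distance inherits the sorted-sample closed form, which is what makes the sliced framework amenable to the kind of analysis described above.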