Generative Model via Quantile Assignment

📅 2026-02-20

🤖 AI Summary

This work proposes NeuroSQL, a deep generative framework that needs no auxiliary networks such as encoders or discriminators, and so avoids the training instability, computational overhead, and mode collapse those networks can introduce. NeuroSQL learns low-dimensional latent representations implicitly: it uses an asymptotic approximation that casts the latent variables as the solution to an optimal transport problem, solves the resulting linear assignment problem (quantile assignment), and passes the latents to a standalone generator. On MNIST, CelebA, AFHQ, and OASIS, the authors report that, relative to VAEs, GANs, and a budget-matched diffusion baseline, NeuroSQL achieves overall lower mean pixel distance between synthetic and authentic images, stronger perceptual and structural fidelity, and the shortest training time. They also report that it remains effective when training samples are limited.

📝 Abstract

Deep generative models (DGMs) play two key roles in modern machine learning: (i) producing new information (e.g., image synthesis) and (ii) reducing dimensionality. However, traditional architectures often rely on auxiliary networks, such as encoders in Variational Autoencoders (VAEs) or discriminators in Generative Adversarial Networks (GANs), which introduce training instability, computational overhead, and risks such as mode collapse. We present NeuroSQL, a new generative paradigm that eliminates the need for auxiliary networks by learning low-dimensional latent representations implicitly. NeuroSQL leverages an asymptotic approximation that expresses the latent variables as the solution to an optimal transport problem. Specifically, NeuroSQL learns the latent variables by solving a linear assignment problem and then passes the latent information to a standalone generator. We benchmark its performance against GANs, VAEs, and a budget-matched diffusion baseline on four datasets: handwritten digits (MNIST), faces (CelebA), animal faces (AFHQ), and brain images (OASIS). Compared to these baselines: (1) in terms of image quality, NeuroSQL achieves overall lower mean pixel distance between synthetic and authentic images and stronger perceptual/structural fidelity; (2) computationally, NeuroSQL requires the least training time; and (3) practically, NeuroSQL provides an effective solution for generating synthetic data with limited training samples. By using quantile assignment rather than an encoder, NeuroSQL provides a fast, stable, and robust way to generate synthetic data with minimal information loss.
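
The abstract describes the latent variables as the solution to a linear assignment problem, without giving the details. The following is a minimal sketch of what such a quantile assignment step could look like, not the authors' implementation. It assumes a one-dimensional latent space, a fixed grid of standard-normal quantiles, a squared-error cost between each sample and the generator's output at each quantile, and a fixed toy generator. The names `latent_quantiles`, `toy_generator`, and `assign_quantiles` are hypothetical. The assignment is solved by exhaustive search, which is only feasible for a handful of samples.

```python
from itertools import permutations
from statistics import NormalDist


def latent_quantiles(n):
    """Assumed latent grid: standard-normal quantiles at levels (i + 0.5) / n."""
    nd = NormalDist()
    return [nd.inv_cdf((i + 0.5) / n) for i in range(n)]


def toy_generator(z):
    """Hypothetical stand-in for a generator: maps a latent scalar to a 2-pixel 'image'."""
    return (2.0 * z, -z)


def squared_error(x, y):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def assign_quantiles(data):
    """Give each sample exactly one latent quantile, minimizing total squared error.

    This is a linear assignment problem over the cost matrix
    cost[i][j] = || data[i] - toy_generator(z[j]) ||^2, solved here by brute force.
    """
    n = len(data)
    z = latent_quantiles(n)
    cost = [[squared_error(x, toy_generator(zj)) for zj in z] for x in data]
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    return [z[j] for j in best], best


if __name__ == "__main__":
    z = latent_quantiles(4)
    # Samples produced from shuffled quantiles; the assignment should undo the shuffle.
    data = [toy_generator(z[k]) for k in (2, 0, 3, 1)]
    assigned, perm = assign_quantiles(data)
    print(perm)
```

For realistic sample sizes, the brute-force search would be replaced by a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment`. How the assignment step is interleaved with generator training is not stated in the abstract, so the sketch keeps the generator fixed.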

Problem

Research questions and friction points this paper is trying to address.

Deep Generative Models

Auxiliary Networks

Training Instability

Mode Collapse

Computational Overhead

Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantile Assignment

Optimal Transport

Auxiliary-Free Generative Model