PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation

📅 2024-02-06

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Existing evaluation methods for generative models often rely on assumptions about the true data density, require auxiliary pretrained models, or depend on handcrafted feature engineering. To address these limitations, the paper proposes PQMass, a likelihood-free evaluation framework that makes no assumptions about the density of the true distribution and requires no training or fitting of auxiliary models. PQMass divides the sample space into non-overlapping regions, counts how many real and generated samples fall in each region, and applies chi-squared tests to those counts. The result is a p-value measuring the probability that the two sets of region counts are drawn from the same multinomial distribution. The method is evaluated on data of various modalities and dimensions, where it assesses the quality, novelty, and diversity of generated samples, and it scales to moderately high-dimensional data without feature extraction.

📝 Abstract

We propose a likelihood-free method for comparing two distributions given samples from each, with the goal of assessing the quality of generative models. The proposed approach, PQMass, provides a statistically rigorous method for assessing the performance of a single generative model or the comparison of multiple competing models. PQMass divides the sample space into non-overlapping regions and applies chi-squared tests to the number of data samples that fall within each region, giving a p-value that measures the probability that the bin counts derived from two sets of samples are drawn from the same multinomial distribution. PQMass does not depend on assumptions regarding the density of the true distribution, nor does it rely on training or fitting any auxiliary models. We evaluate PQMass on data of various modalities and dimensions, demonstrating its effectiveness in assessing the quality, novelty, and diversity of generated samples. We further show that PQMass scales well to moderately high-dimensional data and thus obviates the need for feature extraction in practical applications.
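The procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the regions are built, so the sketch assumes Voronoi cells around reference points drawn at random from the pooled samples, and it uses SciPy's generic `chi2_contingency` for the test. The function names `region_counts` and `two_sample_region_test` and the parameter `n_regions` are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import chi2_contingency


def region_counts(samples, centers):
    """Count how many samples fall in each Voronoi cell around `centers`."""
    # Each sample belongs to the cell of its nearest center.
    _, nearest = cKDTree(centers).query(samples)
    return np.bincount(nearest, minlength=len(centers))


def two_sample_region_test(x, y, n_regions=20, seed=0):
    """Chi-squared test on region counts of two sample sets.

    Assumption: regions are Voronoi cells around points picked at random
    from the pooled samples; the source does not specify the partition.
    """
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    picked = rng.choice(len(pooled), size=n_regions, replace=False)
    centers = pooled[picked]

    # 2 x n_regions table: row 0 holds counts for x, row 1 for y.
    counts = np.stack([region_counts(x, centers), region_counts(y, centers)])
    # Drop regions that are empty in both sets (expected count would be 0).
    counts = counts[:, counts.sum(axis=0) > 0]

    stat, p_value, dof, _ = chi2_contingency(counts, correction=False)
    return stat, p_value, dof


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    real = rng.normal(size=(1000, 2))
    good = rng.normal(size=(1000, 2))        # same distribution as `real`
    bad = rng.normal(size=(1000, 2)) + 3.0   # shifted distribution
    print("same distribution:   ", two_sample_region_test(real, good))
    print("shifted distribution:", two_sample_region_test(real, bad))
```

A small p-value indicates that the two sets of region counts are unlikely to come from the same multinomial distribution. Because the result depends on the random partition, repeating the test over several seeds and inspecting the spread of p-values is a reasonable precaution in this sketch.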

Problem

Research questions and friction points this paper is trying to address.

Assesses generative model quality without likelihood assumptions.

Compares multiple generative models using chi-squared tests.

Evaluates sample quality, novelty, and diversity effectively.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Likelihood-free method for distribution comparison

Chi-squared tests on non-overlapping sample regions

No assumptions on true distribution density
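For reference, the chi-squared comparison of region counts listed above can be written in the standard two-sample Pearson form. This is the textbook statistic, stated here as an assumption; the listing does not reproduce the paper's exact formula. The symbols are introduced for illustration: $k_i^{x}$ and $k_i^{y}$ are the numbers of samples from each set falling in region $i$, $N_x$ and $N_y$ are the sample sizes, and $n_R$ is the number of regions.

```latex
\hat{p}_i = \frac{k_i^{x} + k_i^{y}}{N_x + N_y},
\qquad
\chi^2 = \sum_{i=1}^{n_R}
  \left[
    \frac{\left(k_i^{x} - N_x \hat{p}_i\right)^2}{N_x \hat{p}_i}
    +
    \frac{\left(k_i^{y} - N_y \hat{p}_i\right)^2}{N_y \hat{p}_i}
  \right]
```

If both sets of counts are drawn from the same multinomial distribution, this statistic approximately follows a chi-squared distribution with $n_R - 1$ degrees of freedom, from which the p-value is obtained.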
