A Likelihood Based Approach to Distribution Regression Using Conditional Deep Generative Models

📅 2024-10-02
🏛️ arXiv.org
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This paper addresses conditional distribution regression when the response variable lies in a high-dimensional ambient space but concentrates near a low-dimensional manifold. Methodologically, it proposes a likelihood-based framework for estimating conditional deep generative models via a sieve maximum likelihood estimator, combined with a small noise perturbation of the data when they are supported sufficiently close to the manifold. The theoretical analysis establishes convergence rates under both the Hellinger and Wasserstein metrics. The central contribution is a set of convergence rates that depend solely on the intrinsic dimension of the underlying manifold and the smoothness of the target conditional density, thereby rigorously characterizing how the model circumvents the "curse of dimensionality." This statistical mechanism is especially advantageous for nearly singular conditional distributions. Empirical validation on synthetic and real-world datasets confirms both the method's effectiveness and its alignment with the theoretical predictions.
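
For concreteness, the estimator the summary refers to is a sieve MLE over a growing class of conditional densities; a minimal statement of its form, and of the Hellinger metric in which the rates are stated, is sketched below. These are standard definitions written in assumed notation, not taken verbatim from the paper:

```latex
% Standard definitions with assumed notation: \mathcal{F}_n is the sieve of
% network-induced conditional densities available at sample size n.
\hat{p}_n \;=\; \arg\max_{p \in \mathcal{F}_n} \; \frac{1}{n} \sum_{i=1}^{n} \log p(y_i \mid x_i),
\qquad
H(p, q) \;=\; \Big( \tfrac{1}{2} \int \big( \sqrt{p(y \mid x)} - \sqrt{q(y \mid x)} \big)^2 \, dy \Big)^{1/2}.
```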

📝 Abstract
In this work, we explore the theoretical properties of conditional deep generative models under the statistical framework of distribution regression where the response variable lies in a high-dimensional ambient space but concentrates around a potentially lower-dimensional manifold. More specifically, we study the large-sample properties of a likelihood-based approach for estimating these models. Our results lead to the convergence rate of a sieve maximum likelihood estimator (MLE) for estimating the conditional distribution (and its deconvolved counterpart) of the response given predictors in the Hellinger (Wasserstein) metric. Our rates depend solely on the intrinsic dimension and smoothness of the true conditional distribution. These findings provide an explanation of why conditional deep generative models can circumvent the curse of dimensionality from the perspective of statistical foundations and demonstrate that they can learn a broader class of nearly singular conditional distributions. Our analysis also emphasizes the importance of introducing a small noise perturbation to the data when they are supported sufficiently close to a manifold. Finally, in our numerical studies, we demonstrate the effective implementation of the proposed approach using both synthetic and real-world datasets, which also provide complementary validation to our theoretical findings.
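
To make the pipeline concrete, below is a minimal PyTorch sketch of likelihood-based training of a conditional generator y ≈ G(x, z), where the intractable conditional likelihood is approximated by a Monte Carlo average of Gaussian kernels of bandwidth sigma; this sigma plays the role of the small noise perturbation the abstract recommends when the data concentrate near a manifold. All names here (`CondGenerator`, `perturbed_nll`, `sigma`, `n_z`) are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch (not the authors' code): Monte Carlo sieve-MLE training of a
# conditional deep generative model y ~ G(x, z) with a Gaussian smoothing of width sigma.
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    """G(x, z): maps a predictor x and latent noise z to a response in R^{y_dim}."""
    def __init__(self, x_dim, z_dim, y_dim, hidden=128):
        super().__init__()
        self.z_dim = z_dim
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, y_dim),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=-1))

def perturbed_nll(G, x, y, sigma=0.05, n_z=64):
    """Monte Carlo estimate of -log p_sigma(y | x), where
    p_sigma(y | x) = E_z[ N(y; G(x, z), sigma^2 I) ]; the sigma-smoothing
    stands in for the paper's small noise perturbation near the manifold."""
    n, y_dim = y.shape
    z = torch.randn(n, n_z, G.z_dim)                   # latent draws per data point
    x_rep = x.unsqueeze(1).expand(-1, n_z, -1)         # (n, n_z, x_dim)
    g = G(x_rep, z)                                    # (n, n_z, y_dim)
    sq = ((y.unsqueeze(1) - g) ** 2).sum(dim=-1)       # squared distances, (n, n_z)
    const = y_dim * torch.log(torch.tensor(sigma)) \
        + 0.5 * y_dim * torch.log(torch.tensor(2.0) * torch.pi)
    log_kernel = -0.5 * sq / sigma ** 2 - const        # log N(y; g, sigma^2 I)
    log_p = torch.logsumexp(log_kernel, dim=1) - torch.log(torch.tensor(float(n_z)))
    return -log_p.mean()

# Toy data: responses lie on a 1-d circle embedded in R^3, indexed by the predictor x,
# plus a tiny perturbation so the training data are not exactly singular.
torch.manual_seed(0)
x = torch.rand(256, 1)
t = torch.rand(256, 1)
y = torch.cat([torch.cos(2 * torch.pi * t), torch.sin(2 * torch.pi * t), x], dim=1)
y = y + 0.01 * torch.randn_like(y)

G = CondGenerator(x_dim=1, z_dim=2, y_dim=3)
opt = torch.optim.Adam(G.parameters(), lr=1e-3)
for step in range(200):
    opt.zero_grad()
    loss = perturbed_nll(G, x, y)
    loss.backward()
    opt.step()
```

The toy data place the response near a one-dimensional curve in R^3, so the true conditional law is nearly singular in the ambient space, which is exactly the regime the theory targets.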
Problem

Research questions and friction points this paper is trying to address.

Estimating conditional distributions with deep generative models in a regression setting
Analyzing convergence rates of the sieve MLE for high-dimensional responses concentrating near a manifold
Circumventing the curse of dimensionality while learning nearly singular conditional distributions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses conditional deep generative models for distribution regression
Employs sieve maximum likelihood estimation with convergence guarantees in Hellinger and Wasserstein metrics
Introduces a small noise perturbation for data supported close to a manifold
Authors
Shivam Kumar (Department of ACMS, University of Notre Dame)
Yun Yang (Department of Mathematics, University of Maryland, College Park)
Lizhen Lin (Department of Mathematics, University of Maryland)
Geometry & Statistics · Bayesian Theory · Statistics Theory of Deep Learning · Geometric Deep Learning