Bayesian Semiparametric Multivariate Density Regression with Coordinate-Wise Predictor Selection

📅 2026-04-09
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study addresses the challenge of modeling the joint density of a multivariate response in the presence of categorical covariates and identifying which covariates affect which response coordinates. The authors propose a Bayesian semiparametric approach based on a Gaussian copula, in which the conditional marginal distributions are represented as mixtures with atoms shared across coordinates. To let the mixture weights vary flexibly with covariates, they use a Tucker tensor factorization in which the usual mode matrices are replaced by coordinate-specific random partition models that aggregate covariate levels with similar effects. This structure yields a coordinate-level predictor selection mechanism that reduces dimension without compromising modeling flexibility. The accompanying Markov chain Monte Carlo algorithm scales with the number of aggregated levels rather than the original levels, which reduces memory requirements and improves computational efficiency. Simulation experiments and an analysis of NHANES dietary data illustrate the method's numerical performance and practical applicability.
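As a rough illustration of the aggregation idea in the summary, the sketch below stores mixture weights per combination of aggregated levels (clusters) instead of per combination of original levels. It is a toy reading of the structure, not the authors' implementation: the covariate names, the partitions, the weight values, and all function names are hypothetical.

```python
from math import prod

# Hypothetical partition for ONE response coordinate:
# each covariate maps its original levels to cluster labels.
PARTITION = {
    "sex": {"F": 0, "M": 1},
    "age_group": {"18-29": 0, "30-44": 0, "45-64": 1, "65+": 1},
    "region": {"N": 0, "S": 0, "E": 0, "W": 0},  # one cluster: no effect
}

# Hypothetical core array: one weight vector per combination of clusters.
CORE = {
    (0, 0, 0): [0.7, 0.2, 0.1],
    (0, 1, 0): [0.5, 0.3, 0.2],
    (1, 0, 0): [0.2, 0.5, 0.3],
    (1, 1, 0): [0.1, 0.3, 0.6],
}

def cluster_labels(x):
    """Map a covariate profile (dict of levels) to its tuple of cluster labels."""
    return tuple(PARTITION[name][x[name]] for name in PARTITION)

def mixture_weights(x):
    """Look up the mixture weights shared by all profiles in the same clusters."""
    return CORE[cluster_labels(x)]

def selected_covariates():
    """A covariate matters for this coordinate only if it has more than one cluster."""
    return [name for name, levels in PARTITION.items()
            if len(set(levels.values())) > 1]

def storage_cells():
    """Number of weight vectors needed: original levels vs aggregated levels."""
    original = prod(len(levels) for levels in PARTITION.values())
    aggregated = prod(len(set(levels.values())) for levels in PARTITION.values())
    return original, aggregated
```

In this toy setup, profiles that differ only in `region`, or only within an age cluster, receive identical weights, so `region` is effectively deselected for this coordinate, and the number of stored weight vectors depends on the cluster counts rather than the level counts.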
📝 Abstract
We propose a flexible Bayesian approach for estimating the joint density of a multivariate outcome of interest in the presence of categorical covariates. Leveraging a Gaussian copula framework, our method effectively captures the dependence structure across different coordinates of the multivariate response. The conditional (on covariates) marginal (across outcomes) distributions are modeled as flexible mixtures with shared atoms across coordinates, while the mixture weights are allowed to vary with covariates through a novel Tucker tensor factorization-based structure, which enables the identification of coordinate-specific subsets of influential covariates. In particular, we replace the traditional mode matrices with coordinate-specific random partition models on the covariate levels, offering a flexible mechanism to aggregate covariate levels that exhibit similar effects on the response. Additionally, to handle settings with many covariates, we introduce a Markov chain Monte Carlo algorithm that scales with the number of aggregated levels rather than the original levels, significantly reducing memory requirements and improving computational efficiency. We demonstrate the method's numerical performance through simulation experiments and its practical applicability through the analysis of NHANES dietary data.
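To make the copula construction in the abstract concrete, here is a minimal sketch of drawing one bivariate response: latent correlated normals are mapped to uniforms, and each uniform is pushed through the quantile function of a coordinate-specific mixture that reuses the same atoms. The normal atoms, the weights, the correlation value, and every function name are assumptions chosen for illustration; the paper's actual kernels and priors are not specified here.

```python
import random
from statistics import NormalDist

# Hypothetical shared atoms: every coordinate mixes the same normal components.
ATOMS = [NormalDist(-2.0, 0.5), NormalDist(0.0, 1.0), NormalDist(3.0, 0.8)]
STD_NORMAL = NormalDist()

def mixture_cdf(y, weights):
    """CDF of the mixture of shared atoms with the given weights."""
    return sum(w * atom.cdf(y) for w, atom in zip(weights, ATOMS))

def mixture_quantile(u, weights, lo=-50.0, hi=50.0, tol=1e-9):
    """Invert the mixture CDF by bisection (the CDF is monotone in y)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, weights) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_response(weights_by_coord, rho, rng):
    """Draw one bivariate response through a Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
    uniforms = [STD_NORMAL.cdf(z) for z in (z1, z2)]
    return [mixture_quantile(u, w) for u, w in zip(uniforms, weights_by_coord)]

# Hypothetical covariate-dependent weights: one weight vector per coordinate
# for each covariate level.
WEIGHTS = {
    "level_a": [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]],
    "level_b": [[0.2, 0.5, 0.3], [0.1, 0.3, 0.6]],  # coordinate 2 unaffected
}

rng = random.Random(1)
draw = sample_response(WEIGHTS["level_a"], rho=0.5, rng=rng)
```

Here the covariate changes only the mixture weights of the first coordinate, while the dependence between coordinates comes entirely from the latent correlation `rho`.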
Problem

Research questions and friction points this paper is trying to address.

multivariate density regression
coordinate-wise predictor selection
categorical covariates
Bayesian semiparametric modeling
Gaussian copula
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian semiparametric
Gaussian copula
Tucker tensor factorization
coordinate-wise predictor selection
random partition model
Giovanni Toto
Department of Statistics and Data Sciences, The University of Texas at Austin
Peter Müller
Department of Statistics and Data Sciences, The University of Texas at Austin; Department of Mathematics, The University of Texas at Austin
Abhra Sarkar
The University of Texas at Austin
Bayesian Methods, Measurement Error Models, (Hidden) Markov Models