Inference on Generalized Latent Variable Models with High-Dimensional Responses and Covariates

📅 2026-04-29
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses regression with high-dimensional responses and covariates when the responses depend on both observed covariates and unobserved latent variables, a setting in which standard multivariate regression is inadequate. The authors propose a generalized latent variable model that accommodates mixed-type high-dimensional responses and allows flexible dependence between covariates and latent variables. An alternating algorithm updates the regression parameters and the latent variables in turn, which converts the nonconvex estimation problem into a sequence of convex subproblems. The resulting estimator is shown to be statistically consistent, with an accompanying error bound, and a debiased estimator built on it is shown to be asymptotically normal, which supports statistical inference on covariate effects. The method is illustrated with an application to evaluating the fairness of the Programme for International Student Assessment (PISA).
📝 Abstract
Regression models with both high-dimensional responses and covariates have attracted growing attention. Standard multivariate regression models become inadequate when the response variables depend not only on observed covariates but also on latent variables that capture key unobserved characteristics. To draw statistical inferences on covariate effects while accounting for latent variables, we consider a high-dimensional generalized latent variable model that accommodates mixed-type responses and allows for flexible dependence between covariates and latent variables. This makes it more suitable for many real-world applications than existing methods, which either rely on a linear regression form or impose restrictive assumptions on the dependence between covariates and latent variables. We develop an alternating algorithm that iteratively updates the regression parameters and the latent variables, transforming an intractable nonconvex problem into a sequence of tractable convex subproblems. Theoretically, we provide algorithmic guarantees by establishing statistical consistency of the resulting estimator and deriving an error bound for it. Further, building on this estimator, we construct a debiased estimator for the covariate effect and establish its asymptotic normality. The effectiveness of the proposed method is demonstrated through an application to evaluating the fairness of the Programme for International Student Assessment (PISA).
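The alternating idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes binary responses with a logit link, omits any penalties, identifiability constraints, and the debiasing step, and uses plain gradient steps with step halving. All names (`alternating_fit`, `B`, `G`, `U`) are hypothetical. The point it shows is that the loss is convex in the regression parameters when the latent variables are fixed, and convex in the latent variables when the regression parameters are fixed.

```python
import numpy as np

def nll(Y, X, U, B, G):
    """Average Bernoulli negative log-likelihood under a logit link."""
    eta = X @ B + U @ G                       # n x q linear predictor
    return np.mean(np.logaddexp(0.0, eta) - Y * eta)

def residual(Y, X, U, B, G):
    """Derivative of the average loss with respect to the linear predictor."""
    eta = X @ B + U @ G
    prob = 0.5 * (1.0 + np.tanh(0.5 * eta))   # numerically stable sigmoid
    return (prob - Y) / Y.size

def descend(loss, params, grads, step, max_halvings=30):
    """One gradient step; halve the step until the loss does not increase."""
    base = loss(*params)
    for _ in range(max_halvings):
        trial = [p - step * g for p, g in zip(params, grads)]
        if loss(*trial) <= base:
            return trial
        step *= 0.5
    return params                             # no acceptable step found

def alternating_fit(Y, X, k, n_iter=30, seed=0):
    """Alternate between (B, G) with U fixed and U with (B, G) fixed."""
    rng = np.random.default_rng(seed)
    n, q = Y.shape
    B = np.zeros((X.shape[1], q))             # covariate effects (assumed name)
    G = 0.1 * rng.standard_normal((k, q))     # loadings (assumed name)
    U = 0.1 * rng.standard_normal((n, k))     # latent variables (assumed name)
    step0 = float(Y.size)                     # generous start; halving shrinks it
    history = [nll(Y, X, U, B, G)]
    for _ in range(n_iter):
        # Convex subproblem 1: U held fixed, update regression parameters.
        R = residual(Y, X, U, B, G)
        B, G = descend(lambda b, g: nll(Y, X, U, b, g),
                       [B, G], [X.T @ R, U.T @ R], step0)
        # Convex subproblem 2: (B, G) held fixed, update latent variables.
        R = residual(Y, X, U, B, G)
        (U,) = descend(lambda u: nll(Y, X, u, B, G), [U], [R @ G.T], step0)
        history.append(nll(Y, X, U, B, G))
    return B, G, U, history

# Synthetic usage example (all data simulated, for illustration only).
rng = np.random.default_rng(1)
n, p, q, k = 200, 3, 20, 2
X = rng.standard_normal((n, p))
eta_true = X @ rng.standard_normal((p, q)) + rng.standard_normal((n, k)) @ rng.standard_normal((k, q))
Y = (rng.random((n, q)) < 0.5 * (1.0 + np.tanh(0.5 * eta_true))).astype(float)
B_hat, G_hat, U_hat, history = alternating_fit(Y, X, k)
```

Because each step is accepted only if it does not increase the loss, `history` is non-increasing by construction. That is a property of this sketch, not a statement about the convergence guarantees established in the paper.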
Problem

Research questions and friction points this paper is trying to address.

high-dimensional responses
latent variables
covariate effects
generalized latent variable models
statistical inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

generalized latent variable model
high-dimensional inference
alternating algorithm
debiased estimator
asymptotic normality
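
Why asymptotic normality of the debiased estimator matters for inference can be shown with a generic worked form. The notation below is an assumption made for illustration: the summary does not give the paper's exact scaling or variance expression. Here $\tilde\beta$ denotes a debiased estimate of a single covariate effect $\beta$, and $\hat\sigma$ a consistent estimate of its asymptotic standard deviation.

```latex
% Generic form of an asymptotic normality statement (assumed notation)
\sqrt{n}\,\bigl(\tilde\beta - \beta\bigr) \;\xrightarrow{d}\; \mathcal{N}\bigl(0, \sigma^{2}\bigr)

% Wald-type (1 - \alpha) confidence interval that follows from it
\tilde\beta \;\pm\; z_{1-\alpha/2}\,\frac{\hat\sigma}{\sqrt{n}}

% Test statistic for H_0 : \beta = 0, compared against standard normal quantiles
T \;=\; \frac{\sqrt{n}\,\tilde\beta}{\hat\sigma}
```

A result of this kind is what turns a point estimate of a covariate effect into confidence intervals and hypothesis tests, which is the type of inference the PISA fairness application relies on.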