G-HIVE: Parameter Estimation and Approximate Inference for Multivariate Response Generalized Linear Models with Hidden Variables

📅 2025-08-29

📈 Citations: 0

✨ Influential: 0

career value

192K/year

🤖 AI Summary

This paper addresses multi-response generalized linear models (GLMs) with unobserved latent variables that simultaneously affect both responses and covariates. To tackle the resulting parameter non-identifiability, estimation bias, and invalid inference—longstanding challenges unaddressed in nonlinear multi-response settings—we propose G-HIVE, a unified estimation and inference framework. Methodologically, G-HIVE introduces an orthogonal projection-based weighting scheme to eliminate first-order asymptotic bias; integrates a factor model with corrected quasi-likelihood to jointly achieve bias correction and uncertainty quantification; and establishes a Berry–Esseen-type Gaussian approximation theory to rigorously characterize convergence rates and distributional accuracy of estimators. Theoretically, G-HIVE guarantees consistency and asymptotic normality. Empirically, it substantially outperforms baseline methods that ignore latent confounding, as demonstrated on both synthetic and real-world datasets.

Technology Category

Application Category

📝 Abstract

In practice, there often exist unobserved variables, also termed hidden variables, associated with both the response and covariates. Existing works in the literature mostly focus on linear regression with hidden variables. However, when the regression model is non-linear, the presence of hidden variables leads to new challenges in parameter identification, estimation, and statistical inference. This paper studies multivariate response generalized linear models (GLMs) with hidden variables. We propose a unified framework for parameter estimation and statistical inference called G-HIVE, short for 'G'eneralized - 'HI'dden 'V'ariable adjusted 'E'stimation. Specifically, based on factor model assumptions, we propose a modified quasi-likelihood approach to estimate an intermediate parameter, defined through a set of reweighted estimating equations. The key of our approach is to construct the proper weight, so that the first-order asymptotic bias of the estimator can be removed by orthogonal projection. Moreover, we propose an approximate inference framework for uncertainty quantification. Theoretically, we establish the first-order and second-order asymptotic bias and the convergence rate of our estimator. In addition, we characterize the accuracy of the Gaussian approximation of our estimator via the Berry-Esseen bound, which justifies the validity of the proposed approximate inference approach. Extensive simulations and real data analysis results show that G-HIVE is feasibly implementable and can outperform the baseline method that ignores hidden variables.

Problem

Research questions and friction points this paper is trying to address.

Estimating parameters in multivariate GLMs with hidden variables

Addressing identification challenges from unobserved confounding factors

Providing statistical inference with uncertainty quantification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modified quasi-likelihood approach for parameter estimation

Orthogonal projection to remove asymptotic bias

Approximate inference framework with Gaussian approximation

🔎 Similar Papers

Adaptive debiased SGD in high-dimensional GLMs with streaming data