Asymptotic Inference for Constrained Regression

📅 2025-12-14
🤖 AI Summary
This paper addresses asymptotic statistical inference for high-dimensional regression parameters subject to affine constraints. The motivating application is genetic: estimating the causal effect of genetic factors on a continuous diabetes phenotype using protein expression as a mediator, where the effect must satisfy linear constraints derived from the genetic determinants of proteins and from protein–phenotype genetic associations. We propose a class of convex optimization–based constrained estimators. Within a proportional asymptotic regime, we establish, for the first time, an asymptotic normality theory with sharp large-sample optimality. The theory rigorously characterizes the bias–variance trade-off while ensuring consistency, optimal convergence rates, and valid confidence intervals. Our method explicitly incorporates external biological priors (e.g., protein-mediated pathways). In both simulations and real genetic data, it substantially outperforms unconstrained benchmarks, achieving both theoretical rigor and numerical stability.

📝 Abstract
We consider statistical inference in high-dimensional regression problems under affine constraints on the parameter space. This theoretical work is motivated by the study of genetic determinants of diseases, such as diabetes, using external information from mediating protein expression levels. Specifically, we develop rigorous methods for estimating genetic effects on diabetes-related continuous outcomes when these associations are constrained by external information about the genetic determinants of proteins and the genetic relationships between proteins and the outcome of interest. To this end, we discuss multiple candidate estimators and study their theoretical properties, sharp large-sample optimality, and numerical qualities under a high-dimensional proportional asymptotic framework.
Problem

Research questions and friction points this paper is trying to address.

Develops inference methods for high-dimensional constrained regression
Estimates genetic effects on diabetes using protein expression constraints
Analyzes estimators' theoretical properties and optimality in large samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constrained regression with affine parameter constraints
High-dimensional inference using external protein information
Asymptotic optimality under proportional growth framework
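
To make the idea of regression under affine parameter constraints concrete, here is a minimal sketch in Python. It is not the paper's estimator: it solves ordinary least squares subject to an equality constraint A β = b through the standard KKT linear system. The function name, the simulated data, and the dimensions (n = 200, p = 50, m = 5) are all assumptions chosen purely for illustration.

```python
import numpy as np

def constrained_least_squares(X, y, A, b):
    """Minimize ||y - X beta||^2 subject to A beta = b.

    Illustrative only: solves the KKT system
        [X'X  A'] [beta]   [X'y]
        [A    0 ] [lam ] = [b  ]
    which is nonsingular when X has full column rank and A has full row rank.
    """
    p = X.shape[1]
    m = A.shape[0]
    K = np.block([[X.T @ X, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([X.T @ y, b])
    sol = np.linalg.solve(K, rhs)
    return sol[:p]  # drop the Lagrange multipliers

# Simulated data (hypothetical): p/n = 0.25, with 5 affine constraints.
rng = np.random.default_rng(0)
n, p, m = 200, 50, 5
X = rng.standard_normal((n, p))
A = rng.standard_normal((m, p))
beta_true = rng.standard_normal(p)
b = A @ beta_true                      # constraints hold at the truth
y = X @ beta_true + rng.standard_normal(n)

beta_hat = constrained_least_squares(X, y, A, b)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # unconstrained benchmark
```

The constrained fit satisfies A β = b up to numerical precision, while its residual sum of squares can be no smaller than that of the unconstrained fit. How such constraints affect bias, variance, and confidence intervals when p grows proportionally with n is the question the paper studies.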
Madhav Sankaranarayanan
Department of Biostatistics, Harvard T.H. Chan School of Public Health
Yana Hrytsenko
Cardiovascular Institute, Beth Israel Deaconess Medical Center
Jerome I. Rotter
The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center
Tamar Sofer
Cardiovascular Institute, Beth Israel Deaconess Medical Center
Rajarshi Mukherjee
Associate Professor, Biostatistics, Harvard University