Empirical Bayes Data Integration for Multi-Response Regression

📅 2026-02-14
🏛️ Statistica Sinica
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses covariance estimation in transcriptome-wide association studies (TWAS) with multi-source vector-valued responses, particularly when the underlying parameter structure (sparse or dense, low-rank or non-low-rank) is unknown. The authors propose an empirical Bayes linear shrinkage estimator that integrates high-dimensional multi-response regression information by applying global and local shrinkage to the singular values of the data matrix. Unlike conventional estimators that assume either sparsity or low rank, the method does not require either structure, and it scales better computationally than full Bayes procedures. Under a specific loss function, the estimator is asymptotically optimal. Numerical simulations and a real-data analysis of GTEx TWAS data illustrate its estimation accuracy and practical utility.

📝 Abstract
Motivated by applications in tissue-wide association studies (TWAS), we develop a flexible and theoretically grounded empirical Bayes approach for integrating vector-valued outcomes data obtained from different sources. We propose a linear shrinkage estimator that effectively shrinks singular values of a data matrix. This problem is closely connected to estimating covariance matrices under a specific loss, for which we develop asymptotically optimal estimators. The basic linear shrinkage estimator is then extended to a local linear shrinkage estimator, offering greater flexibility. Crucially, the proposed method works under sparse/dense or low-rank/non-low-rank parameter settings, unlike well-known sparse or reduced-rank estimators in the literature. Furthermore, the empirical Bayes approach offers greater scalability in computation compared to intensive full Bayes procedures. The method is evaluated through an extensive set of numerical experiments and applied to real TWAS data obtained from the Genotype-Tissue Expression (GTEx) project.
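To make the phrase "shrinks singular values of a data matrix" concrete, here is a minimal Python sketch. It is not the estimator from the paper: the function name `shrink_singular_values` and the weights are hypothetical, chosen by hand for illustration, whereas the paper estimates the amount of shrinkage by empirical Bayes. A single scalar weight gives a uniform (global) rescaling, and a vector of weights gives one factor per singular value.

```python
import numpy as np


def shrink_singular_values(Y, weights):
    """Rescale the singular values of Y by factors in [0, 1].

    `weights` is either a scalar (same factor for every singular value)
    or a 1-D array with one factor per singular value. These weights are
    illustrative inputs, not data-driven estimates.
    """
    # Thin SVD: Y = U @ diag(s) @ Vt, with s sorted in decreasing order.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    w = np.broadcast_to(np.asarray(weights, dtype=float), s.shape)
    if np.any(w < 0) or np.any(w > 1):
        raise ValueError("weights must lie in [0, 1]")
    # Scale column j of U by w[j] * s[j], then recombine.
    return (U * (w * s)) @ Vt


# Simulated data matrix (assumption: 6 observations, 4 responses).
rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 4))

# One common factor for all singular values.
Y_global = shrink_singular_values(Y, 0.5)

# A separate factor for each singular value.
local_weights = np.array([0.9, 0.7, 0.4, 0.1])
Y_local = shrink_singular_values(Y, local_weights)
```

With a scalar weight the result is simply the data matrix multiplied by that weight; with per-value weights the singular vectors are unchanged and only the spectrum is rescaled.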
Problem

Research questions and friction points this paper is trying to address.

multi-response regression
data integration
covariance estimation
singular value shrinkage
empirical Bayes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Empirical Bayes
multi-response regression
linear shrinkage
covariance estimation
low-rank
Antik Chakraborty
Department of Statistics, Purdue University
Fei Xue
Purdue University
Statistics, biostatistics