Adaptive Multi-Prior Lasso for High-Dimensional Generalized Linear Models

📅 2026-04-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of integrating heterogeneous prior information in high-dimensional generalized linear models, where existing methods struggle to effectively assess and weight priors of varying quality. The authors propose an adaptive multi-prior Lasso approach that, for the first time, enables data-driven, dynamic weighting of prior information within a unified regularization framework. This method automatically reinforces reliable priors while suppressing unreliable ones, all with theoretical guarantees. As demonstrated through simulations and real-world analysis of TCGA breast cancer gene expression data, the proposed technique substantially improves variable selection accuracy, estimation efficiency, and predictive performance.

📝 Abstract

Incorporation of external information into high-dimensional modeling for gene expression data has been shown, both theoretically and empirically, to substantially enhance performance. Such external information, sometimes referred to as prior information or priors, has become increasingly accessible from multiple sources, yet its reliability may vary considerably. Existing approaches often integrate these priors without sufficiently accounting for their quality, which may result in unsatisfactory or even misleading results. To effectively and selectively exploit such priors, we propose adaptive Multi-Prior Lasso, a novel regularization approach that simultaneously identifies reliable prior sources and integrates them to improve model performance. For high-dimensional generalized linear models (GLMs), an adaptive data-driven weight is assigned to each prior, so that more reliable sources are emphasized while less credible ones are downweighted. Theoretical guarantees are established, and the proposed method is shown through extensive simulations to improve estimation, prediction, and variable selection. An application to TCGA breast cancer gene expression data further illustrates the practical value of the proposed method, showing that incorporating prior information from PubMed published studies improves model performance.
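The abstract does not state the estimator's exact form, so the following is only a minimal sketch of the general idea (per-prior weights feeding a weighted L1 penalty in a GLM), not the authors' algorithm. Everything in it is an assumption made for illustration: the logistic-regression setting, the proximal-gradient solver, the helper names `score_priors` and `prior_penalties`, the overlap-based reliability score, and the rule `lam / (1 + support)` for turning source weights into per-feature penalties.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding: the proximal operator of a weighted L1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def weighted_lasso_logistic(X, y, penalty, n_iter=500):
    """Fit logistic regression with per-feature L1 penalties by proximal gradient.

    `penalty` is a length-p array (or scalar); a smaller entry means weaker shrinkage.
    """
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the curvature of the mean logistic loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (prob - y) / n
        beta = soft_threshold(beta - step * grad, step * penalty)
    return beta


def score_priors(beta_init, priors):
    """Hypothetical reliability score for each prior source.

    Each prior is a collection of feature indices it flags as relevant. The score is
    the share of those indices that an initial fit also selected (nonzero coefficient).
    """
    selected = set(np.flatnonzero(beta_init).tolist())
    return np.array([len(selected & set(idx)) / max(len(idx), 1) for idx in priors])


def prior_penalties(p, priors, source_weights, lam):
    """Hypothetical rule: lower the penalty on features endorsed by highly weighted priors."""
    support = np.zeros(p)
    for idx, w in zip(priors, source_weights):
        support[list(idx)] += w
    return lam / (1.0 + support)
```

Under these assumptions, one would first fit `weighted_lasso_logistic` with a constant penalty, pass the result to `score_priors`, convert the scores with `prior_penalties`, and refit. A prior whose flagged features find no support in the data receives a weight near zero and leaves the penalty essentially unchanged, which mirrors the downweighting behavior the abstract describes.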

Problem

Research questions and friction points this paper is trying to address.

high-dimensional generalized linear models

prior information

reliability of priors

gene expression data

variable selection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Multi-Prior Lasso

high-dimensional GLMs

prior reliability