Bayesian variable selection in a Cox proportional hazards model with the "Sum of Single Effects" prior

📅 2025-06-06
🤖 AI Summary
Bayesian variable selection for Cox proportional hazards models is difficult in genetic fine-mapping because covariates are often very strongly correlated (correlations exceeding 0.99) and data sets are very large (hundreds of thousands of samples, potentially thousands of covariates). Method: This paper introduces CoxPH-SuSiE, an extension of the Sum of Single Effects (SuSiE) method to the Cox proportional hazards (CoxPH) model, for Bayesian variable selection regression (BVSR) with time-to-event (TTE) outcomes. Contribution/Results: In simulated fine-mapping data sets, CoxPH-SuSiE shows benefits over existing BVSR methods for TTE outcomes. Applied to UK Biobank asthma data, it identified 14 asthma risk SNPs in 8 risk loci, of which 6 had posterior inclusion probabilities (PIPs) greater than 50%. Two of those 6 are known pathogenic variants, and others lie within a genomic sequence known to regulate *GATA3* expression.

📝 Abstract
Motivated by genetic fine-mapping applications, we introduce a new approach to Bayesian variable selection regression (BVSR) for time-to-event (TTE) outcomes. This new approach is designed to deal with the specific challenges that arise in genetic fine-mapping, including very strong correlations among the covariates, often exceeding 0.99, and very large data sets containing potentially thousands of covariates and hundreds of thousands of samples. We accomplish this by extending the "Sum of Single Effects" (SuSiE) method to the Cox proportional hazards (CoxPH) model. We demonstrate the benefits of the new method, "CoxPH-SuSiE", over existing BVSR methods for TTE outcomes in simulated fine-mapping data sets. We also illustrate CoxPH-SuSiE on real data by fine-mapping asthma loci using data from UK Biobank. This fine-mapping identified 14 asthma risk SNPs in 8 asthma risk loci, among which 6 had strong evidence for being causal (posterior inclusion probability greater than 50%). Two of the 6 putatively causal variants are known to be pathogenic, and others lie within a genomic sequence that is known to regulate the expression of GATA3.
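
For orientation, the two ingredients named in the abstract can be written down in their standard textbook forms: the CoxPH hazard, and the SuSiE prior as it is usually stated for regression coefficients. The notation here (baseline hazard $h_0$, number of single effects $L$, prior weights $\boldsymbol{\pi}$, prior variances $\sigma_l^2$, posterior weights $\alpha_{lj}$) is an assumption made for illustration and does not come from the abstract; the paper's exact parameterization and inference scheme may differ.

```latex
\begin{aligned}
h_i(t) &= h_0(t)\,\exp\!\big(\mathbf{x}_i^\top \mathbf{b}\big)
  && \text{(CoxPH hazard for sample } i\text{)} \\
\mathbf{b} &= \sum_{l=1}^{L} \mathbf{b}_l, \qquad
\mathbf{b}_l = \beta_l\,\boldsymbol{\gamma}_l
  && \text{(sum of } L \text{ single effects)} \\
\boldsymbol{\gamma}_l &\sim \mathrm{Mult}(1, \boldsymbol{\pi}), \qquad
\beta_l \sim \mathcal{N}(0, \sigma_l^2)
  && \text{(each effect picks exactly one covariate)} \\
\mathrm{PIP}_j &= 1 - \prod_{l=1}^{L} \big(1 - \alpha_{lj}\big)
  && \text{(posterior inclusion probability of covariate } j\text{)}
\end{aligned}
```

Here $\alpha_{lj}$ denotes the posterior probability that covariate $j$ is the one selected by the $l$-th single effect.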
Problem

Research questions and friction points this paper is trying to address.

Extends SuSiE to CoxPH for genetic fine-mapping challenges
Handles high covariate correlations and large datasets
Identifies causal SNPs in time-to-event genetic studies
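
As a concrete reference point for the time-to-event setting, the following is a minimal sketch of the Cox partial log-likelihood, the quantity a CoxPH fit is built on. It assumes no tied event times and right-censoring only; the function name and argument layout are hypothetical and are not taken from the paper or from any CoxPH-SuSiE software.

```python
import numpy as np

def cox_partial_loglik(beta, X, time, event):
    """Cox partial log-likelihood (illustrative; assumes no tied event times).

    beta  : (p,) coefficient vector
    X     : (n, p) covariate matrix, e.g. SNP genotypes
    time  : (n,) observed follow-up times
    event : (n,) 1 if the event was observed, 0 if censored
    """
    eta = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    # Sort by decreasing time so a running sum covers the risk set
    # {k : time_k >= time_i} of each sample i.
    order = np.argsort(-np.asarray(time, dtype=float))
    eta_sorted = eta[order]
    event_sorted = np.asarray(event)[order]
    # Running log-sum-exp of eta over each risk set, computed stably.
    log_risk = np.logaddexp.accumulate(eta_sorted)
    # Only samples with an observed event contribute a term.
    return float(np.sum(event_sorted * (eta_sorted - log_risk)))
```

With `beta` set to zero and every event observed, each term reduces to minus the log of the risk-set size, which gives a quick sanity check on the implementation.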
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends SuSiE to CoxPH model
Handles high covariate correlations
Scales to large genetic datasets
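
To illustrate why a SuSiE-style fit stays interpretable when covariates are highly correlated, here is a small sketch of how posterior inclusion probabilities and a credible set are conventionally derived from the per-effect posterior weights. The `alpha` matrix below is made-up toy input, and both function names are hypothetical; this is not the paper's code.

```python
import numpy as np

def pips_from_alpha(alpha):
    """PIP_j = 1 - prod_l (1 - alpha[l, j]).

    alpha : (L, p) array; row l holds the posterior probability that each
            covariate is the one selected by the l-th single effect.
    """
    alpha = np.asarray(alpha, dtype=float)
    return 1.0 - np.prod(1.0 - alpha, axis=0)

def credible_set(alpha_row, coverage=0.95):
    """Smallest set of covariates whose weights sum to at least `coverage`."""
    alpha_row = np.asarray(alpha_row, dtype=float)
    order = np.argsort(-alpha_row)            # largest weights first
    cum = np.cumsum(alpha_row[order])
    k = int(np.searchsorted(cum, coverage)) + 1
    return sorted(order[:k].tolist())

# Toy example (assumed values): SNPs 0 and 1 are near-perfectly correlated,
# so the first effect splits its weight between them; SNP 2 stands alone.
alpha = np.array([
    [0.49, 0.49, 0.01, 0.01],
    [0.00, 0.00, 0.90, 0.10],
])
pips = pips_from_alpha(alpha)
cs_first_effect = credible_set(alpha[0])
```

In this toy input neither SNP 0 nor SNP 1 reaches a PIP of 50% on its own, yet the credible set for the first effect contains exactly those two SNPs, which is how uncertainty between strongly correlated candidates is typically reported.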
Authors

Yunqi Yang
Committee on Genetics, Genomics and System Biology, University of Chicago, Chicago, IL

Karl Tayeb
University of Chicago

Peter Carbonetto
University of Chicago
Quantitative genetics

Xiaoyuan Zhong
Department of Human Genetics, University of Chicago, Chicago, IL

Carole Ober
Department of Human Genetics, University of Chicago, Chicago, IL

Matthew Stephens
University of Chicago
Statistics, Genetics