🤖 AI Summary
This work addresses the challenge of detecting causal variants in genomic regions with strong linkage disequilibrium while rigorously controlling the false discovery rate (FDR), using functional annotations as prior information. To this end, we propose AnnoKn (annotation-informed knockoffs), a method that incorporates functional annotations into the knockoff framework. AnnoKn combines the knockoff procedure with adaptive Lasso regression in a unified Bayesian framework to perform annotation-informed variable selection, and it is further extended to operate on GWAS summary statistics when individual-level data are not accessible. Simulations and applications to GTEx and GWAS datasets show that AnnoKn achieves higher power for identifying causal variants than existing annotation-informed variable selection methods while maintaining valid FDR control.
📝 Abstract
Genome-wide association studies (GWAS) often find association signals between many genetic variants and traits of interest in a genomic region. Functional annotations of these variants provide valuable prior information that helps prioritize biologically relevant variants and enhances the power to detect causal variants. However, due to substantial correlations among these variants, a critical question is how to rigorously control the false discovery rate while effectively leveraging prior knowledge. We introduce annotation-informed knockoffs (AnnoKn), a knockoff-based method that performs annotation-informed variable selection with strict control of the false discovery rate. AnnoKn integrates the knockoff procedure with adaptive Lasso regression to evaluate the importance of multiple covariates while incorporating functional annotation information within a unified Bayesian framework. To facilitate real-world applications where individual-level data are not accessible, we further extend AnnoKn to operate on summary statistics. Through simulations and real-world applications to GTEx and GWAS datasets, we show that AnnoKn achieves superior power in detecting causal genetic variants compared with existing annotation-informed variable selection methods, while maintaining valid control over false discoveries.
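For readers unfamiliar with knockoffs, the selection step that gives the general knockoff framework its FDR guarantee can be sketched as follows. This is a minimal illustration of the standard knockoff+ threshold, not AnnoKn's own implementation: the function names and the example statistics are hypothetical, and the sketch assumes the feature statistics W_j (large and positive when an original variant looks more important than its knockoff copy) have already been computed, for instance from Lasso coefficient differences.

```python
def knockoff_plus_threshold(w, q):
    """Return the knockoff+ threshold for feature statistics w at target FDR q.

    The threshold is the smallest t among the nonzero |w_j| such that
    (1 + #{j: w_j <= -t}) / max(1, #{j: w_j >= t}) <= q.
    Returns infinity when no such t exists (nothing is selected).
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # estimates false positives
        positives = sum(1 for x in w if x >= t)   # variables that would be selected
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(w, q):
    """Return the indices of variables whose statistic reaches the threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical statistics for ten variants (illustrative values only).
w_example = [5.0, 4.0, 3.5, 3.0, 2.5, -0.5, 2.0, 0.1, -0.2, 1.5]
selected = knockoff_select(w_example, q=0.2)
print(selected)
```

In this sketch, negative statistics act as a stand-in count of false positives among the large positive ones, which is why the ratio inside the loop serves as a conservative estimate of the false discovery proportion. How annotation information shapes the statistics themselves is specific to AnnoKn and is not reproduced here.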