🤖 AI Summary
Traditional genome-wide association studies (GWAS) struggle to uncover causal disease mechanisms, while existing knowledge graph–enhanced GWAS (KGWAS) approaches rely on generic knowledge graphs that can introduce spurious associations. To address this limitation, this work proposes a KGWAS framework that replaces the generic knowledge graph with context-specific ones built from perturb-seq–derived, cell-type-specific gene interaction networks. By combining knowledge graph pruning, data-driven modeling of gene relationships, and statistical inference, the method improves the sparsity, biological coherence, and robustness of the identified disease pathways without compromising statistical power.
📝 Abstract
Genome-Wide Association Studies (GWAS) identify associations between genetic variants and disease; however, moving beyond associations to causal mechanisms is critical for therapeutic target prioritization. The recently proposed Knowledge Graph GWAS (KGWAS) framework addresses this challenge by linking genetic variants to downstream gene-gene interactions via a knowledge graph (KG), thereby improving detection power and providing mechanistic insights. However, the original KGWAS implementation relies on a large general-purpose KG, which can introduce spurious correlations. We hypothesize that cell-type-specific KGs from disease-relevant cell types will better support disease mechanism discovery. Here, we show that the general-purpose KG in KGWAS can be substantially pruned with no loss of statistical power on downstream tasks, and that performance improves further when gene-gene relationships derived from perturb-seq data are incorporated. Importantly, using a sparse, context-specific KG built from direct perturb-seq evidence yields more consistent and biologically robust disease-critical networks.
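As a rough illustration of the idea of pruning a general-purpose KG and adding perturb-seq-derived edges, the sketch below builds a small context-specific edge set. It is not the authors' implementation: the abstract does not state the pruning criterion, so the expression-based filter, the effect-size threshold, the function name `build_context_kg`, and the placeholder gene names are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (source gene, target gene)


def build_context_kg(
    generic_edges: List[Edge],
    expressed_genes: Set[str],
    perturb_effects: Dict[Edge, float],
    effect_threshold: float = 0.5,
) -> Set[Edge]:
    """Return a sparse, context-specific edge set (hypothetical criteria)."""
    # Assumed pruning rule: keep a generic edge only if both genes are
    # expressed in the cell type of interest.
    pruned = {
        (src, dst)
        for src, dst in generic_edges
        if src in expressed_genes and dst in expressed_genes
    }
    # Assumed augmentation rule: add a perturbation -> response edge when
    # the absolute effect size reaches the threshold.
    perturb_edges = {
        edge
        for edge, effect in perturb_effects.items()
        if abs(effect) >= effect_threshold
    }
    return pruned | perturb_edges


# Toy inputs with placeholder gene names (not real data).
generic = [("GENE_A", "GENE_B"), ("GENE_A", "GENE_C"), ("GENE_C", "GENE_D")]
expressed = {"GENE_A", "GENE_B", "GENE_D"}
effects = {("GENE_B", "GENE_D"): -1.2, ("GENE_A", "GENE_D"): 0.1}

context_kg = build_context_kg(generic, expressed, effects)
print(sorted(context_kg))
```

In this toy case, two of the three generic edges are dropped because GENE_C is not expressed, and one perturbation edge passes the threshold, so the resulting graph is smaller than the generic one while still containing direct perturbation evidence.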