🤖 AI Summary
The “missing heritability” problem in complex diseases remains unresolved: the variants identified so far explain only a small proportion of heritability, and existing methods have limited power to detect the joint effects of common and rare variants. To address this, we propose functional analysis of variance (FANOVA), an association test for sequence variants in a genomic region that uses linkage disequilibrium structure and genetic position information and accommodates causal variants with either protective or risk-increasing effects. Grounded in functional data analysis, FANOVA uses an ANOVA-based strategy to test association with a qualitative trait. In simulations, FANOVA achieves higher statistical power than SKAT and the functional linear model (FLM), particularly with small sample sizes and low to moderate effect sizes. Applied to sequencing data from the Dallas Heart Study, FANOVA identified both *ANGPTL4* and *ANGPTL3* as associated with obesity, whereas SKAT and FLM each detected only one of the two genes.
📝 Abstract
Although progress has been made in identifying common genetic variants associated with human diseases, for most common complex diseases the identified variants account for only a small proportion of heritability. Challenges remain in finding additional, as yet unknown, genetic variants that predispose to complex diseases. With advances in next-generation sequencing technologies, sequencing studies have become commonplace in genetic research. Ongoing exome-sequencing and whole-genome-sequencing studies generate a massive number of sequence variants and allow researchers to comprehensively investigate their role in human diseases. The discovery of new disease-associated variants can be enhanced by powerful and computationally efficient statistical methods. In this paper, we propose a functional analysis of variance (FANOVA) method for testing the association of sequence variants in a genomic region with a qualitative trait. FANOVA has several advantages: (1) it tests for a joint effect of gene variants, both common and rare; (2) it fully utilizes linkage disequilibrium and genetic position information; and (3) it allows for either protective or risk-increasing causal variants. Through simulations, we show that FANOVA outperforms two widely used methods, SKAT and a previously proposed method based on functional linear models (FLM), especially when the sample size of a study is small and/or the sequence variants have low to moderate effects. We conduct an empirical study by applying the three methods (FANOVA, SKAT, and FLM) to sequencing data from the Dallas Heart Study. Whereas SKAT and FLM detected ANGPTL4 and ANGPTL3, respectively, as associated with obesity, FANOVA identified both genes.