🤖 AI Summary
This study addresses limitations of ASSET, a subset-based meta-analysis method that computes p-values with an analytic approximation under a normality assumption. That approximation had been evaluated only in limited scenarios and for p-values no smaller than $10^{-3}$, and assessing it in the extreme tail with naïve Monte Carlo simulation would be prohibitively expensive. The work systematically evaluates ASSET's tail p-values and introduces an efficient importance sampling (IS) algorithm that accurately estimates extremely small p-values for both independent and overlapping studies. When normality holds, the analytic approximation is highly accurate across nearly the entire p-value range. When normality is violated (small sample sizes, low-frequency variants, or non-normal traits), ASSET p-values can be inflated or deflated by orders of magnitude, whereas the IS estimates remain accurate. The method is illustrated with single-cell eQTL mapping in peripheral blood mononuclear cells from the OneK1K cohort and in lung cells from a Korean population.
📝 Abstract
Pooling genome-wide association studies of multiple related traits can substantially increase power for detecting genetic variants with pleiotropic effects. ASSET, which exhaustively searches all subsets of studies for association signals, has been widely used to detect modest effects and improve interpretability. Under a normality assumption, ASSET computes p-values via an analytic approximation that accounts for multiple testing. However, this approximation has been evaluated only in limited scenarios and for p-values no smaller than $10^{-3}$. A systematic assessment in the extreme tail is therefore needed, yet naïve Monte Carlo methods would require prohibitively many simulations. We develop a computationally efficient importance-sampling (IS) algorithm that provides accurate ASSET p-value estimates for both independent and overlapping studies, achieving substantial efficiency gains over naïve Monte Carlo, particularly for very small p-values. Using IS, we show that ASSET's analytic approximation is highly accurate across nearly the entire p-value range when normality holds. In contrast, when normality is violated (due to small sample sizes, low-frequency variants, or non-normal traits), ASSET p-values can be inflated or deflated by orders of magnitude, whereas our IS approach remains accurate. We illustrate the method through applications to single-cell eQTL mapping using peripheral blood mononuclear cells from the OneK1K cohort and lung cells from a Korean population.
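To make the efficiency argument concrete, the following is a minimal sketch of importance sampling for a tail probability of a subset-maximum statistic. It is not the authors' algorithm. It assumes independent studies, standard normal z-scores under the null, a one-sided statistic $T = \max_S w_S^\top Z$ with sample-size-based unit weights $w_S$, and a generic mean-shift mixture proposal. The names `subset_directions` and `max_subset_tail_is` are hypothetical.

```python
import itertools
import math
import random


def subset_directions(sample_sizes):
    """Return one unit weight vector w_S per non-empty subset S of studies.

    Assumption: the subset statistic is sum_{k in S} sqrt(n_k / n_S) * Z_k,
    where n_S is the total sample size of the subset.
    """
    k = len(sample_sizes)
    directions = []
    for r in range(1, k + 1):
        for subset in itertools.combinations(range(k), r):
            n_total = sum(sample_sizes[i] for i in subset)
            w = [0.0] * k
            for i in subset:
                w[i] = math.sqrt(sample_sizes[i] / n_total)
            directions.append(w)
    return directions


def max_subset_tail_is(sample_sizes, t, n_draws=20000, seed=0):
    """Estimate P(max_S w_S^T Z > t) for Z ~ N(0, I) by importance sampling.

    Proposal (an illustrative choice): pick a subset uniformly at random,
    then draw Z ~ N(t * w_S, I), so that most draws land in the rejection
    region instead of almost none, as happens with naive Monte Carlo.
    """
    rng = random.Random(seed)
    dirs = subset_directions(sample_sizes)
    m = len(dirs)
    total = 0.0
    for _ in range(n_draws):
        shift = dirs[rng.randrange(m)]
        z = [rng.gauss(0.0, 1.0) + t * s for s in shift]
        scores = [sum(wi * zi for wi, zi in zip(w, z)) for w in dirs]
        if max(scores) > t:
            # Likelihood ratio of the null density to the mixture proposal:
            # N(z; t*w, I) / N(z; 0, I) = exp(t * w^T z - t^2 / 2) for unit w.
            mixture = sum(math.exp(t * s - 0.5 * t * t) for s in scores) / m
            total += 1.0 / mixture
    return total / n_draws


def normal_upper_tail(t):
    """Exact P(Z > t) for a standard normal, used as a reference value."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))
```

With a single study the estimate can be checked against `normal_upper_tail(t)`. With several studies it should fall between the single-study tail and the union bound over all $2^K - 1$ subsets. A naïve Monte Carlo run would need on the order of $1/p$ draws to observe even one exceedance, which is the cost that motivates an IS approach for very small p-values.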