AI Summary
Existing methods for identifying spatially variable genes in spatial transcriptomics rely predominantly on marginal analysis, which neglects the structural information embedded in gene interaction networks and does not correct for confounding variation arising from heterogeneous cellular composition. To address these limitations, we propose a Bayesian model with thresholded graph Laplacian regularization that, within a unified framework, identifies spatially variable genes, integrates network topology, and adjusts for cellular-composition confounders. The model uses a zero-inflated negative binomial distribution to capture the properties of count data, imposes a graph Laplacian prior to encode gene-gene interactions, and leverages network-guided regularization to enhance sparsity and biological interpretability. Evaluations on both simulated and real spatial transcriptomics datasets demonstrate improved detection accuracy and robust identification of co-varying gene modules, addressing key limitations of conventional marginal approaches.
Abstract
Identifying genes that display spatial patterns is critical to investigating expression interactions within a spatial context and to understanding complex biological mechanisms. Although a growing number of statistical methods have been designed to identify spatially variable genes, most are based on marginal analysis and share a common limitation: they do not adequately accommodate the dependence (network) structure among genes, even though a biological process usually involves changes in multiple genes that interact in a complex network. Moreover, the latent cellular composition within spots may introduce confounding variation that reduces identification accuracy. In this study, we develop a novel Bayesian regularization approach for spatial transcriptomic data that corrects for the confounding variation induced by varying cellular distributions. Advancing significantly beyond existing studies, we propose a thresholded graph Laplacian regularization that simultaneously identifies spatially variable genes and accommodates the network structure among genes. The proposed method is based on a zero-inflated negative binomial distribution, which accommodates the count nature, zero inflation, and overdispersion of spatial transcriptomic data. Extensive simulations and an application to real data demonstrate the competitive performance of the proposed method.
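The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mean/dispersion parameterization of the zero-inflated negative binomial, the hard-thresholding rule, and every function and parameter name below (`zinb_logpmf`, `graph_laplacian`, `laplacian_penalty`, `threshold`) are assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np
from scipy.special import gammaln


def zinb_logpmf(y, mu, phi, pi):
    """Log-pmf of a zero-inflated negative binomial (assumed parameterization).

    mu  : NB mean (> 0)
    phi : NB dispersion/size (> 0), so that Var = mu + mu**2 / phi
    pi  : probability of a structural zero (0 < pi < 1)
    """
    y = np.asarray(y, dtype=float)
    # Negative binomial log-pmf in the mean/dispersion form.
    nb_log = (gammaln(y + phi) - gammaln(phi) - gammaln(y + 1.0)
              + phi * np.log(phi / (phi + mu))
              + y * np.log(mu / (phi + mu)))
    # A zero can come from the structural-zero component or from the NB itself.
    log_zero = np.logaddexp(np.log(pi), np.log1p(-pi) + nb_log)
    # A positive count can only come from the NB component.
    log_pos = np.log1p(-pi) + nb_log
    return np.where(y == 0, log_zero, log_pos)


def graph_laplacian(adj):
    """Unnormalized Laplacian L = D - A of a symmetric gene-gene adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj


def laplacian_penalty(beta, adj, threshold=0.0):
    """Quadratic form b^T L b on hard-thresholded effects b.

    Effects with |beta_j| <= threshold are set to zero first. This is a
    simplified stand-in for a thresholded graph Laplacian penalty, not the
    prior used in the paper.
    """
    beta = np.asarray(beta, dtype=float)
    b = np.where(np.abs(beta) > threshold, beta, 0.0)
    return float(b @ graph_laplacian(adj) @ b)
```

For a symmetric adjacency matrix, the quadratic form equals the sum over connected gene pairs of `a_ij * (b_i - b_j)**2`, so it is small when connected genes have similar effects. The zero-inflation term lets the model place extra mass at zero beyond what the negative binomial alone would predict.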