Joint identification of spatially variable genes via a network-assisted Bayesian regularization approach

📅 2024-07-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing methods for identifying spatially variable genes in spatial transcriptomics rely predominantly on marginal analysis, neglecting the structural information embedded in gene interaction networks and failing to correct for confounding variation arising from heterogeneous cellular composition. To address these limitations, the authors propose a Bayesian model with thresholded graph Laplacian regularization, which jointly identifies spatially variable genes, integrates network topology, and adjusts for cellular composition confounders within a unified framework. The model uses a zero-inflated negative binomial distribution to capture the count nature, zero inflation, and overdispersion of the data, and applies the thresholded graph Laplacian regularization to encode gene–gene interactions while selecting spatially variable genes. Simulations and an application to real spatial transcriptomics data show that the method performs competitively.

📝 Abstract

Identifying genes that display spatial patterns is critical to investigating expression interactions within a spatial context and further dissecting biological understanding of complex mechanistic functionality. Despite the increase in statistical methods designed to identify spatially variable genes, they are mostly based on marginal analysis and share the limitation that the dependence (network) structures among genes are not well accommodated, where a biological process usually involves changes in multiple genes that interact in a complex network. Moreover, the latent cellular composition within spots may introduce confounding variations, negatively affecting identification accuracy. In this study, we develop a novel Bayesian regularization approach for spatial transcriptomic data, with the confounding variations induced by varying cellular distributions effectively corrected. Significantly advancing from the existing studies, a thresholded graph Laplacian regularization is proposed to simultaneously identify spatially variable genes and accommodate the network structure among genes. The proposed method is based on a zero-inflated negative binomial distribution, effectively accommodating the count nature, zero inflation, and overdispersion of spatial transcriptomic data. Extensive simulations and the application to real data demonstrate the competitive performance of the proposed method.
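
As a minimal illustration of the zero-inflated negative binomial distribution named in the abstract, the sketch below evaluates its log-probability for a single count. The mean/size parameterization and the names `mu`, `phi`, and `pi` are assumptions made for this example; the paper's own notation and likelihood (including how spatial effects and cellular composition enter the mean) may differ.

```python
import math

def zinb_log_pmf(y, mu, phi, pi):
    """Log-probability of a count y under a zero-inflated negative binomial.

    Assumed parameterization (not taken from the paper):
      mu  > 0       mean of the negative binomial component
      phi > 0       size (inverse dispersion); NB variance is mu + mu**2 / phi
      0 <= pi < 1   probability of a structural (excess) zero
    """
    # Negative binomial log-pmf in the mean/size parameterization.
    log_nb = (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (phi + mu))
        + y * math.log(mu / (phi + mu))
    )
    if y == 0:
        # A zero can be structural or can come from the NB component.
        return math.log(pi + (1.0 - pi) * math.exp(log_nb))
    # A positive count can only come from the NB component.
    return math.log(1.0 - pi) + log_nb
```

Setting `pi` to zero recovers an ordinary negative binomial, and letting `phi` grow large approaches a (zero-inflated) Poisson, which is why this family covers both the zero inflation and the overdispersion mentioned above.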

Problem

Research questions and friction points this paper is trying to address.

Identifying spatially variable genes while incorporating gene network dependencies

Correcting confounding variations from latent cellular composition in spots

Modeling count nature with zero inflation and overdispersion in transcriptomic data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian regularization approach corrects confounding variation from varying cellular distributions

Thresholded graph Laplacian incorporates gene network structure

Zero-inflated negative binomial distribution models the count nature, zero inflation, and overdispersion of the data
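
The idea behind graph Laplacian regularization can be sketched with the standard quadratic form: for a gene network with adjacency weights a_ij, the penalty on per-gene effects beta equals the sum over edges of a_ij * (beta_i - beta_j)^2, so connected genes are encouraged to have similar effects. The code below is an illustrative sketch only. The function names, the toy network, and the hard-thresholding step are assumptions for this example and are not the paper's thresholded prior, whose exact form is not given in this summary.

```python
def graph_laplacian(adj):
    """Unnormalized graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adj)
    return [
        [(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]

def laplacian_penalty(beta, adj):
    """Quadratic form beta^T L beta; small when linked genes have similar effects."""
    lap = graph_laplacian(adj)
    n = len(beta)
    return sum(beta[i] * lap[i][j] * beta[j] for i in range(n) for j in range(n))

def hard_threshold(beta, tau):
    """Hypothetical selection step: zero out effects with magnitude at most tau."""
    return [b if abs(b) > tau else 0.0 for b in beta]

# Toy chain network of three genes: 0 - 1 - 2 (assumed for illustration).
adj = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
penalty = laplacian_penalty([1.0, 1.0, 0.0], adj)   # (1-1)^2 + (1-0)^2
selected = hard_threshold([0.05, -0.8, 0.3], 0.1)   # first effect is zeroed
```

In a Bayesian formulation, a penalty of this kind typically appears as the negative log of a Gaussian-type prior on the effects, with the thresholding providing sparsity so that only some genes are flagged as spatially variable.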

🔎 Similar Papers

Distance-Preserving Spatial Representations in Genomic Data