🤖 AI Summary
This study addresses the challenge of clustering genes in spatial transcriptomics data while jointly preserving spatial structure and expression similarity. We propose a Gaussian process mixture model incorporating spatial regularization. Methodologically, we employ a generalized Cholesky decomposition to construct a sparse precision matrix, thereby avoiding misspecification of the covariance kernel and explicitly modeling non-stationary spatial correlations. Coupled with a Gaussian mixture model, our framework enables joint spatial–expression clustering of genes. Experiments on tumor microenvironment data demonstrate that the method robustly identifies spatially coherent gene clusters with region-specific biological relevance (such as tumor core, invasive front, and stromal regions), substantially enhancing biological interpretability. Compared to existing approaches, our method requires no pre-specified stationary covariance function and exhibits superior adaptability to spatial heterogeneity.
📝 Abstract
Spatial transcriptomics measures the expression of thousands of genes in a tissue sample while preserving its spatial structure. This class of technologies has enabled the investigation of the spatial variation of gene expression and its impact on specific biological processes. Identifying genes with similar expression profiles is of utmost importance, thus motivating the development of flexible methods that leverage the spatial structure of the data to cluster genes. Here, we propose a modeling framework for clustering observations measured over numerous spatial locations via Gaussian processes. Rather than specifying their covariance kernels as a function of the spatial structure, we use the spatial structure to inform a generalized Cholesky decomposition of their precision matrices. This approach prevents issues with kernel misspecification and facilitates the estimation of a non-stationary spatial covariance structure. Applied to spatial transcriptomic data, our model identifies gene clusters with distinctive spatial correlation patterns across tissue areas comprising different cell types, such as tumoral and stromal areas.
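To make the idea of a spatially informed generalized Cholesky decomposition concrete, the following is a minimal sketch, not the authors' implementation. It assumes the modified Cholesky form Ω = Tᵀ D⁻¹ T, where T is unit lower triangular and D is diagonal, and it assumes that the spatial structure enters by restricting each location's nonzero coefficients in T to its nearest previously ordered locations. The function names (`neighbor_sets`, `build_precision`, `gaussian_logpdf`) and the neighbor count `m` are hypothetical and chosen for illustration only.

```python
import numpy as np


def neighbor_sets(coords, m):
    """For each location i (in the given order), return the indices of up to
    m nearest locations among those ordered before i."""
    sets = []
    for i in range(coords.shape[0]):
        if i == 0:
            sets.append(np.array([], dtype=int))
            continue
        dist = np.linalg.norm(coords[:i] - coords[i], axis=1)
        sets.append(np.argsort(dist)[:m])
    return sets


def build_precision(phi, d, nbrs):
    """Assemble T (unit lower triangular) and the precision Omega = T' D^{-1} T.

    phi[i] holds the regression coefficients of location i on its neighbors
    nbrs[i]; d[i] > 0 is the corresponding innovation variance.
    """
    n = len(d)
    T = np.eye(n)
    for i, (idx, coef) in enumerate(zip(nbrs, phi)):
        T[i, idx] = -np.asarray(coef)
    omega = T.T @ np.diag(1.0 / np.asarray(d)) @ T
    return T, omega


def gaussian_logpdf(y, mu, T, d):
    """Gaussian log-density using the factorization directly.

    Since det(T) = 1, log det(Omega) = -sum(log d), so no matrix
    inversion or determinant computation is needed.
    """
    e = T @ (y - mu)  # innovations (prediction errors)
    n = len(y)
    return -0.5 * (n * np.log(2 * np.pi) + np.sum(np.log(d)) + np.sum(e**2 / d))
```

Under these assumptions, any real coefficients `phi` and any positive `d` yield a symmetric positive definite precision matrix, so no stationary kernel has to be chosen; sparsity in T comes only from the neighbor sets. In a mixture model, a per-cluster log-density of this kind could be evaluated for each gene to compute cluster membership probabilities.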