🤖 AI Summary
Current spatial omics clustering methods often over-smooth biological boundaries, depend on a pre-specified number of clusters, and lack effective mechanisms for multi-omics integration. To address these limitations, this work proposes BaySC, a Bayesian spatial clustering framework that combines a mixture of finite mixtures (MFM) model with a Markov random field (MRF) to infer the number of spatial domains automatically while preserving local spatial consistency. BaySC also introduces an interpretable weighted log-likelihood fusion strategy that quantifies the contribution of each omics modality to the resulting tissue map. Experiments on ten single-modality and two multi-modality datasets show that BaySC achieves competitive accuracy on standard clustering metrics and consistently outperforms existing methods in preserving spatial topology, as measured by the spatially aware Adjusted Rand Index (spARI).
📝 Abstract
Spatial domain identification requires jointly modeling molecular signatures and physical coordinates, yet current tools frequently over-smooth biological boundaries, require user-specified cluster numbers, and lack principled multimodal integration. We introduce BaySC, an integrative Bayesian spatial clustering framework for spatial domain identification. BaySC infers the number of spatial domains from the data by employing a Mixture of Finite Mixtures (MFM) prior. Tissue topology is modeled via a Markov Random Field (MRF) applied to discrete cellular assignments, a strategy that enforces local spatial coherence without distorting the underlying gene expression features. This enables BaySC to accurately map contiguous tissue layers as well as spatially scattered, transcriptionally identical cell populations. Furthermore, BaySC handles spatial multi-omics data through a weighted log-likelihood fusion mechanism executed via Gibbs sampling. This approach assigns interpretable weights to each modality, allowing users to quantify the biological relevance of different data layers to the final tissue map. Validated across ten single-modal spatial transcriptomics and two spatial multi-omics datasets, BaySC yields highly interpretable probabilistic outputs. It demonstrates competitive accuracy on standard clustering metrics and consistently outperforms existing tools in preserving spatial topography, as measured by the spatially aware Adjusted Rand Index (spARI).
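To make the fusion idea concrete, the following is a minimal sketch of one Gibbs sweep in which a weighted sum of per-modality log-likelihoods is combined with an MRF term on the discrete labels. It is an illustration under stated assumptions, not the BaySC implementation: the Potts form of the MRF, the smoothing parameter `beta`, the fixed modality `weights`, the fixed number of clusters, and the function names `site_conditional` and `gibbs_sweep` are all hypothetical. The MFM step that infers the number of domains is omitted.

```python
import numpy as np

def site_conditional(loglik_by_modality, weights, neighbor_labels, n_clusters, beta):
    """Conditional distribution over cluster labels for a single spot.

    loglik_by_modality: one length-n_clusters array per modality (assumed precomputed).
    weights: one non-negative weight per modality (assumed fixed here).
    neighbor_labels: current labels of the spot's spatial neighbours.
    beta: Potts smoothing strength (assumed form of the MRF).
    """
    # Weighted log-likelihood fusion across modalities.
    fused = sum(w * np.asarray(ll, dtype=float)
                for w, ll in zip(weights, loglik_by_modality))
    # Potts prior: reward labels shared with neighbouring spots.
    counts = np.bincount(np.asarray(neighbor_labels, dtype=int), minlength=n_clusters)
    logits = fused + beta * counts
    logits = logits - logits.max()  # stabilise before exponentiating
    probs = np.exp(logits)
    return probs / probs.sum()

def gibbs_sweep(labels, logliks, weights, neighbors, beta, rng):
    """Resample every spot's label once, in place, and return the labels.

    logliks: one (n_spots, n_clusters) array per modality.
    neighbors: for each spot, a list of neighbouring spot indices.
    """
    n_spots, n_clusters = logliks[0].shape
    for i in range(n_spots):
        nbr_labels = labels[np.asarray(neighbors[i], dtype=int)]
        probs = site_conditional([ll[i] for ll in logliks], weights,
                                 nbr_labels, n_clusters, beta)
        labels[i] = rng.choice(n_clusters, p=probs)
    return labels
```

Because the spatial term acts only on the label counts of neighbouring spots, the expression features themselves are never smoothed, which matches the motivation given in the abstract. Setting `beta` to zero in this sketch removes the spatial term, leaving only the fused likelihood.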