SAGE-FM: A lightweight and interpretable spatial transcriptomics foundation model

📅 2026-01-21
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes SAGE-FM, a lightweight, interpretable, and spatially aware foundation model for modeling location-dependent gene expression in spatial transcriptomics. Built on a graph convolutional network, the model is trained with a masked-central-spot prediction objective on 416 human Visium samples spanning 15 organs, and learns spatially coherent embeddings. It also supports in silico perturbation analyses that recover directional ligand–receptor interactions and upstream–downstream regulatory relationships. Predictions for 91% of masked genes correlate significantly with measured expression (p < 0.05). The embeddings outperform MOFA in unsupervised clustering, reach 81% accuracy on pathologist-defined spot annotation in oropharyngeal squamous cell carcinoma, and improve glioblastoma subtype prediction relative to MOFA.

📝 Abstract
Spatial transcriptomics enables spatial gene expression profiling, motivating computational models that capture spatially conditioned regulatory relationships. We introduce SAGE-FM, a lightweight spatial transcriptomics foundation model based on graph convolutional networks (GCN) trained with a masked-central-spot prediction objective. Trained on 416 human Visium samples spanning 15 organs, SAGE-FM learns spatially coherent embeddings that recover masked genes robustly, with 91% of masked genes showing significant correlations (p < 0.05). The SAGE-FM generated embeddings outperform MOFA and spatial transcriptomics in unsupervised clustering and preservation of biological heterogeneity. SAGE-FM generalizes to downstream tasks, enabling 81% accuracy in pathologist-defined spot annotation in oropharyngeal squamous cell carcinoma and improving glioblastoma subtype prediction relative to MOFA. In silico perturbation experiments further show that the model captures directional ligand–receptor and upstream–downstream regulatory effects consistent with ground truth. These results demonstrate that simple, parameter-efficient GCNs can serve as biologically interpretable and spatially aware foundation models for large-scale spatial transcriptomics.
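
The masked-central-spot objective described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the single GCN layer, the linear decoder, the mean-squared-error loss, the star-shaped neighborhood graph (one central spot with six neighbors, mirroring the hexagonal Visium layout), and all names such as `masked_center_forward` are assumptions made for illustration only.

```python
import numpy as np


def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm: np.ndarray, x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One graph convolution followed by ReLU."""
    return np.maximum(a_norm @ x @ w, 0.0)


def masked_center_forward(adj, expr, center, w_enc, w_dec):
    """Hide the central spot's expression, then predict it from its neighbors."""
    x = expr.copy()
    x[center] = 0.0                           # mask the central spot (assumed: zero-fill)
    h = gcn_layer(normalized_adjacency(adj), x, w_enc)
    pred = h[center] @ w_dec                  # assumed linear decoder back to gene space
    loss = float(np.mean((pred - expr[center]) ** 2))
    return h[center], pred, loss


rng = np.random.default_rng(0)
n_spots, n_genes, n_hidden = 7, 20, 8

# Hypothetical neighborhood: spot 0 is the center, spots 1..6 are its neighbors.
adj = np.zeros((n_spots, n_spots))
adj[0, 1:] = 1.0
adj[1:, 0] = 1.0

expr = rng.random((n_spots, n_genes))         # synthetic expression, not real data
w_enc = rng.normal(scale=0.1, size=(n_genes, n_hidden))
w_dec = rng.normal(scale=0.1, size=(n_hidden, n_genes))

embedding, pred, loss = masked_center_forward(adj, expr, 0, w_enc, w_dec)
```

Because the central spot is zeroed before the graph convolution, its embedding and prediction depend only on neighboring spots, which is what forces a model trained this way to encode spatial context. In a real setting the weights would be fitted by gradient descent over many such neighborhoods.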
Problem

Research questions and friction points this paper is trying to address.

spatial transcriptomics
foundation model
gene regulation
spatial context
biological heterogeneity
Innovation

Methods, ideas, or system contributions that make the work stand out.

spatial transcriptomics
graph convolutional networks
foundation model
masked prediction
biological interpretability
Xianghao Zhan
Meta, Stanford University, Samsung Research America, Zhejiang University
Traumatic Brain Injury, BCI, Health Sensors, Biomedical Informatics, ML Uncertainty
Jingyu Xu
Division of Computational Medicine, Department of Medicine, Stanford University, Stanford, CA 94305, USA; Division of Immunology and Rheumatology, Department of Medicine, Stanford University, Stanford, CA 94305, USA
Yuanning Zheng
Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Division of Computational Medicine, Department of Medicine, Stanford University, Stanford, CA 94305, USA; Division of Immunology and Rheumatology, Department of Medicine, Stanford University, Stanford, CA 94305, USA
Zinaida Good
Division of Computational Medicine, Department of Medicine, Stanford University, Stanford, CA 94305, USA; Parker Institute for Cancer Immunotherapy, Stanford University, Stanford, CA 94305, USA; Team PROMISE, Weill Cancer Hub West, Stanford University, Stanford, CA 94305, USA
O. Gevaert
Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Division of Computational Medicine, Department of Medicine, Stanford University, Stanford, CA 94305, USA; Team PROMISE, Weill Cancer Hub West, Stanford University, Stanford, CA 94305, USA