Spatial Transcriptomics Expression Prediction from Histopathology Based on Cross-Modal Mask Reconstruction and Contrastive Learning

📅 2025-06-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
The high cost and scarcity of spatial transcriptomics (ST) data hinder broad adoption. To address this, we propose a cross-modal framework that combines mask reconstruction with contrastive learning to predict spatially resolved gene expression profiles directly from low-cost whole-slide images (WSIs). The method jointly uses multi-scale histological feature extraction, spatially aware representation learning, and cross-modal contrastive alignment to preserve both per-gene prediction fidelity and inter-gene correlation structure. Evaluated on six disease datasets, it improves the Pearson correlation coefficient (PCC) by 6.27%, 6.11%, and 11.26% on average for highly expressed, highly variable, and biomarker genes, respectively. The model also generalizes to datasets with limited samples and shows potential for spatial localization of cancer tissue, pointing toward cost-effective, high-throughput spatial expression profiling.

📝 Abstract
Spatial transcriptomics captures gene expression levels at different spatial locations in tissue. It is widely used in tumor microenvironment analysis and molecular profiling of histopathology, and it provides valuable insights for resolving gene expression and for the clinical diagnosis of cancer. Because data acquisition is costly, large-scale spatial transcriptomics data remain difficult to obtain. In this study, we develop a contrastive learning-based deep learning method to predict spatially resolved gene expression from whole-slide images. Evaluation on six disease datasets shows that, compared with existing methods, our method improves the Pearson correlation coefficient (PCC) for highly expressed genes, highly variable genes, and marker genes by 6.27%, 6.11%, and 11.26%, respectively. Further analysis indicates that our method preserves gene-gene correlations and remains applicable to datasets with limited samples. Our method also shows potential for cancer tissue localization based on biomarker expression.
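The PCC metric and the gene-gene correlation structure mentioned in the abstract can be illustrated with a small sketch. It assumes predictions and measurements are arranged as spots-by-genes matrices; the function names `per_gene_pcc` and `gene_gene_corr` are hypothetical and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
import numpy as np

def per_gene_pcc(pred, true, eps=1e-12):
    """Pearson correlation of each gene (column) across spots (rows).

    Assumed layout: both arrays have shape (n_spots, n_genes).
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    # Center each gene across spots.
    pred_c = pred - pred.mean(axis=0)
    true_c = true - true.mean(axis=0)
    # Covariance term divided by the product of the standard deviations.
    num = (pred_c * true_c).sum(axis=0)
    den = np.sqrt((pred_c ** 2).sum(axis=0) * (true_c ** 2).sum(axis=0))
    # eps guards against division by zero for genes with constant expression.
    return num / (den + eps)

def gene_gene_corr(expr):
    """Gene-by-gene Pearson correlation matrix of a (n_spots, n_genes) array."""
    return np.corrcoef(np.asarray(expr, dtype=float), rowvar=False)
```

Averaging `per_gene_pcc` over a chosen gene subset (for example the most highly expressed or most variable genes) would give one summary number per subset, and comparing `gene_gene_corr` of the predictions with that of the measurements is one possible way to check whether correlation structure is preserved.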
Problem

Research questions and friction points this paper is trying to address.

Predicts spatial gene expression from histopathology images
Improves accuracy for highly expressed and variable genes
Remains applicable with limited samples and supports cancer tissue localization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-modal mask reconstruction for expression prediction
Contrastive learning enhances gene prediction accuracy
Deep learning analyzes whole-slide histopathology images
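As a rough illustration of cross-modal contrastive alignment, the sketch below computes a symmetric InfoNCE-style loss between image-patch embeddings and expression embeddings of the same spots. This is a generic formulation assumed for illustration, not the paper's confirmed loss; the function name `symmetric_info_nce`, the embedding shapes, and the temperature value are all hypothetical.

```python
import numpy as np

def _log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(img_emb, expr_emb, temperature=0.07):
    """Symmetric InfoNCE loss for paired embeddings.

    Assumption: row i of img_emb and row i of expr_emb come from the same
    spot (a positive pair); all other rows in the batch act as negatives.
    """
    img = np.asarray(img_emb, dtype=float)
    expr = np.asarray(expr_emb, dtype=float)
    # L2-normalize so the dot product is a cosine similarity.
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    expr = expr / np.linalg.norm(expr, axis=1, keepdims=True)
    logits = img @ expr.T / temperature
    idx = np.arange(logits.shape[0])
    # Image-to-expression direction: each row should select its own column.
    loss_i2e = -_log_softmax(logits, axis=1)[idx, idx].mean()
    # Expression-to-image direction: each column should select its own row.
    loss_e2i = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2e + loss_e2i)
```

The loss is low when matched image and expression embeddings are more similar to each other than to other spots in the batch, which is the alignment property a contrastive objective is meant to encourage.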
Junzhuo Liu
University of Electronic Science and Technology of China
Markus Eckstein
Institute of Pathology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany; Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), Erlangen, Germany; Comprehensive Cancer Center Alliance WERA (CCC WERA), Erlangen, Germany; Bavarian Cancer Research Center (BZKF), Erlangen, Germany
Zhixiang Wang
University of Tokyo
Computational Photography, Computational Imaging, Machine Learning
Friedrich Feuerhake
Associate Professor of Neuropathology, Freiburg University
Systems Medicine, Digital Pathology, Oncoimmunology
Dorit Merhof
Professor, Faculty of Informatics and Computer Science, University of Regensburg