AI Summary
This study addresses the limitations of existing methods for predicting spatial gene expression from H&E-stained histology slides, which are often compromised by inter-slide staining variability and regression-induced over-smoothing that obscure biologically meaningful expression heterogeneity. The authors propose CHRep, a two-phase framework: during training, it jointly optimizes structure-aware representation learning with correlation-aware regression, symmetric image-expression alignment, and coordinate-guided spatial topology regularization; at inference, it applies a lightweight, backbone-agnostic post-hoc calibration module, trained on the training slides, to enable robust cross-slide prediction without fine-tuning the backbone. By coupling topology-preserving cross-modal learning with this calibration, which combines a non-parametric estimate from a training gallery with a magnitude-regularized correction, the approach supports stable neighborhood retrieval and controlled bias correction rather than relying on a single prediction route. Evaluated on three datasets under leave-one-slide-out evaluation, the method improves PCC(ACG) by 4.0% on cSCC and 9.8% on HER2+ relative to HAGE, and by 39.5% on Alex+10x relative to mclSTExp, the latter together with 9.7% and 9.0% reductions in MSE and MAE, respectively.
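To make the reported metric concrete, the following is a minimal sketch of how a gene-wise Pearson score and a relative improvement could be computed. It assumes that PCC(ACG) is the mean of per-gene Pearson correlations over all considered genes; the exact averaging used by the authors (for example, per slide before pooling) is not stated here. The function names `pearson`, `pcc_all_genes`, and `relative_improvement` are hypothetical, and the numbers are toy values, not results from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        # Assumption: a constant gene contributes zero correlation.
        return 0.0
    return cov / math.sqrt(vx * vy)

def pcc_all_genes(pred, true):
    """Mean gene-wise Pearson correlation; rows are spots, columns are genes."""
    n_genes = len(true[0])
    scores = [
        pearson([row[g] for row in pred], [row[g] for row in true])
        for g in range(n_genes)
    ]
    return sum(scores) / n_genes

def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline

# Toy data: 3 spots x 2 genes (illustrative values only).
true = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]
pred = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0]]

# Gene 0 is perfectly correlated, gene 1 perfectly anti-correlated.
score = pcc_all_genes(pred, true)
gain = relative_improvement(0.55, 0.50)
```

A relative improvement is a ratio of scores, so a 39.5% gain means the new correlation is 1.395 times the baseline value, not 39.5 correlation points higher.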
Abstract
Spatial transcriptomics (ST) enables spatially resolved gene profiling but remains expensive and low-throughput, limiting large-cohort studies and routine clinical use. Predicting spatial gene expression from routine hematoxylin and eosin (H&E) slides is a promising alternative, yet under realistic leave-one-slide-out evaluation, existing models often suffer from slide-level appearance shifts and regression-driven over-smoothing that suppress biologically meaningful variation. CHRep is a two-phase framework for robust histology-to-expression prediction. In the training phase, CHRep learns a structure-aware representation by jointly optimizing correlation-aware regression, symmetric image-expression alignment, and coordinate-induced spatial topology regularization. In the inference phase, cross-slide robustness is improved without backbone fine-tuning through a lightweight calibration module trained on the training slides, which combines a non-parametric estimate from a training gallery with a magnitude-regularized correction module. Unlike prior embedding-alignment or retrieval-based transfer methods that rely on a single prediction route, CHRep couples topology-preserving representation learning with post-hoc calibration, enabling stable neighborhood retrieval and controlled bias correction under slide-level shifts. Across the three cohorts, CHRep consistently improves gene-wise correlation under leave-one-slide-out evaluation, with the largest gains observed on Alex+10x. Relative to HAGE, the Pearson correlation coefficient on all considered genes [PCC(ACG)] increases by 4.0% on cSCC and 9.8% on HER2+. Relative to mclSTExp, PCC(ACG) further improves by 39.5% on Alex+10x, together with 9.7% and 9.0% reductions in mean squared error (MSE) and mean absolute error (MAE), respectively.
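The inference-phase idea described above, a non-parametric gallery estimate combined with a magnitude-regularized correction, can be illustrated with a small sketch. This is not the authors' implementation. It assumes cosine-similarity k-nearest-neighbor retrieval over training-slide embeddings, and it uses a per-gene offset shrunk by an L2 penalty as a stand-in for the correction module. All names (`gallery_estimate`, `fit_shrunk_offset`, `calibrate`) and parameters (`k`, `lam`) are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def gallery_estimate(query, gallery_emb, gallery_expr, k=2):
    """Average expression of the k gallery spots most similar to the query."""
    ranked = sorted(
        range(len(gallery_emb)),
        key=lambda i: cosine(query, gallery_emb[i]),
        reverse=True,
    )[:k]
    n_genes = len(gallery_expr[0])
    return [
        sum(gallery_expr[i][g] for i in ranked) / len(ranked)
        for g in range(n_genes)
    ]

def fit_shrunk_offset(residuals, lam=1.0):
    """Per-gene offset b minimizing sum_i (r_i - b)^2 + lam * b^2.

    Closed form: b = sum_i r_i / (n + lam). Larger lam shrinks b toward 0,
    which keeps the correction small in magnitude.
    """
    n = len(residuals)
    n_genes = len(residuals[0])
    return [sum(r[g] for r in residuals) / (n + lam) for g in range(n_genes)]

def calibrate(query, gallery_emb, gallery_expr, offset, k=2):
    """Gallery estimate plus the shrunk per-gene offset."""
    base = gallery_estimate(query, gallery_emb, gallery_expr, k)
    return [b + o for b, o in zip(base, offset)]

# Toy gallery: 3 training spots, 2-D embeddings, 2 genes (illustrative only).
gallery_emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
gallery_expr = [[2.0, 0.0], [4.0, 0.0], [0.0, 6.0]]

# Residuals (truth minus gallery estimate) that would be measured on training slides.
residuals = [[0.5, -0.2], [1.5, 0.2]]
offset = fit_shrunk_offset(residuals, lam=2.0)

query = [1.0, 0.05]
estimate = gallery_estimate(query, gallery_emb, gallery_expr, k=2)
calibrated = calibrate(query, gallery_emb, gallery_expr, offset, k=2)
```

In this sketch the retrieval step never touches the backbone, which mirrors the claim that robustness is improved without fine-tuning. The penalty weight `lam` controls how far the correction may move the retrieved estimate: setting `lam=0` recovers the plain mean residual.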