🤖 AI Summary
This work addresses the prediction of genome-wide RNA-seq expression profiles from whole-slide histopathology images (WSIs), a task where existing deterministic regression approaches struggle to capture biological heterogeneity and predictive uncertainty. The authors propose RNA-FM, which applies flow-matching generative modeling to this cross-modal problem by formulating transcriptomic inference as a morphology-conditioned, continuous-time transport task. The model is trained on paired WSI–RNA data to learn a conditional velocity field that transports a simple prior distribution to the target gene expression distribution, and it incorporates pathway-level gene structure so that genome-wide prediction stays scalable and interpretable. In the reported experiments, the approach consistently outperforms current state-of-the-art methods while keeping its predictions biologically meaningful.
📝 Abstract
Histopathology whole-slide images (WSIs) are routinely acquired in clinical practice and contain rich tissue morphology, but they do not directly reveal the molecular architecture and functional programs that define pathological states. RNA sequencing (RNA-seq) provides genome-wide transcriptional profiles, but at substantial cost, which motivates WSI-based genome-wide transcriptomic prediction. Existing approaches for predicting gene expression from WSIs predominantly rely on deterministic regression with a one-to-one mapping, which limits their ability to capture biological heterogeneity and predictive uncertainty. We propose RNA-FM, a flow-matching generative framework for genome-wide bulk RNA-seq prediction from WSIs. RNA-FM formulates transcriptomic prediction as a continuous-time conditional transport problem, learning a velocity field that maps a simple prior to the target gene expression distribution conditioned on tissue morphology. By integrating pathway-level structure, RNA-FM enables scalable and biologically interpretable genome-wide gene expression imputation. Extensive experiments demonstrate that RNA-FM consistently outperforms state-of-the-art approaches while maintaining biological meaningfulness. Code is available at https://github.com/YXSong000/RNA-FM.
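To make the "velocity field that maps a simple prior to the target distribution" idea concrete, the following is a minimal NumPy sketch of generic conditional flow matching. It is not the RNA-FM implementation: the straight-line interpolation path, the standard Gaussian prior, the Euler integrator, and all names (`fm_training_pair`, `fm_loss`, `sample`, `velocity_fn`, `cond`) are assumptions chosen for illustration, and the pathway-level structure described in the abstract is not modeled here.

```python
import numpy as np

def fm_training_pair(x1, rng):
    """Build flow-matching regression targets for a batch of expression vectors x1.

    Assumes a standard Gaussian prior and a straight-line path (illustrative choice).
    """
    x0 = rng.standard_normal(x1.shape)       # sample from the simple prior
    t = rng.uniform(size=(x1.shape[0], 1))   # one time in [0, 1) per sample
    xt = (1.0 - t) * x0 + t * x1             # point on the path from x0 to x1
    target = x1 - x0                         # velocity of that straight path
    return xt, t, target

def fm_loss(velocity_fn, x1, cond, rng):
    """Mean squared error between predicted and target velocities."""
    xt, t, target = fm_training_pair(x1, rng)
    pred = velocity_fn(xt, t, cond)          # hypothetical model: (x_t, t, WSI features)
    return float(np.mean((pred - target) ** 2))

def sample(velocity_fn, cond, n_genes, rng, n_steps=50):
    """Draw expression profiles by Euler-integrating the velocity field from t=0 to t=1."""
    x = rng.standard_normal((cond.shape[0], n_genes))  # start from the prior
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = np.full((cond.shape[0], 1), k * dt)
        x = x + dt * velocity_fn(x, t, cond)
    return x
```

In this sketch, `cond` stands for a matrix of slide-level morphology features and `velocity_fn` for any trainable model. Calling `sample` several times with the same `cond` and different random draws yields a set of plausible expression profiles rather than a single point estimate, which is the property that separates this formulation from deterministic regression.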