Cross-modal Diffusion Modelling for Super-resolved Spatial Transcriptomics

πŸ“… 2024-04-19
πŸ›οΈ International Conference on Medical Image Computing and Computer-Assisted Intervention
πŸ“ˆ Citations: 7
✨ Influential: 1
πŸ“„ PDF
πŸ€– AI Summary
Spatial transcriptomics (ST) suffers from low spatial resolution, hindering fine-grained gene expression profiling; existing super-resolution methods often exhibit reconstruction uncertainty and mode collapse when integrating histological images with gene expression data. To address this, we propose the first cross-modal conditional diffusion model for ST super-resolution, integrating a multimodal disentanglement network with a cross-modal adaptive modulation mechanism. We design dynamic cross-attention to hierarchically model cell–tissue structural relationships and introduce a co-expression gene graph neural network to capture multi-gene synergistic interactions. Evaluated on three public datasets, our method significantly outperforms state-of-the-art approaches, effectively mitigating mode collapse while enhancing both spatial resolution of gene expression maps and biological interpretability.

Technology Category

Application Category

πŸ“ Abstract
The recent advancement of spatial transcriptomics (ST) allows to characterize spatial gene expression within tissue for discovery research. However, current ST platforms suffer from low resolution, hindering in-depth understanding of spatial gene expression. Super-resolution approaches promise to enhance ST maps by integrating histology images with gene expressions of profiled tissue spots. However, current super-resolution methods are limited by restoration uncertainty and mode collapse. Although diffusion models have shown promise in capturing complex interactions between multi-modal conditions, it remains a challenge to integrate histology images and gene expression for super-resolved ST maps. This paper proposes a cross-modal conditional diffusion model for super-resolving ST maps with the guidance of histology images. Specifically, we design a multi-modal disentangling network with cross-modal adaptive modulation to utilize complementary information from histology images and spatial gene expression. Moreover, we propose a dynamic cross-attention modelling strategy to extract hierarchical cell-to-tissue information from histology images. Lastly, we propose a co-expression-based gene-correlation graph network to model the co-expression relationship of multiple genes. Experiments show that our method outperforms other state-of-the-art methods in ST super-resolution on three public datasets.
Problem

Research questions and friction points this paper is trying to address.

Enhancing low-resolution spatial transcriptomics using histology images
Overcoming restoration uncertainty and mode collapse in super-resolution
Integrating multi-modal data for improved gene expression mapping
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-modal diffusion model for super-resolved spatial transcriptomics
Multi-modal network with cross-modal adaptive modulation
Dynamic cross-attention modeling for hierarchical tissue information
πŸ”Ž Similar Papers
No similar papers found.
X
Xiaofei Wang
Department of Clinical Neurosciences, University of Cambridge, UK
X
Xingxu Huang
Zhejiang Lab, China
S
Stephen J. Price
Department of Clinical Neurosciences, University of Cambridge, UK
C
Chao Li
Department of Applied Mathematics and Theoretical Physics, University of Cambridge, UK