AI Summary
Spatial transcriptomics (ST) suffers from low spatial resolution, hindering fine-grained gene expression profiling; existing super-resolution methods often exhibit reconstruction uncertainty and mode collapse when integrating histological images with gene expression data. To address this, we propose the first cross-modal conditional diffusion model for ST super-resolution, integrating a multimodal disentanglement network with a cross-modal adaptive modulation mechanism. We design dynamic cross-attention to hierarchically model cell-to-tissue structural relationships and introduce a co-expression gene graph neural network to capture multi-gene synergistic interactions. Evaluated on three public datasets, our method significantly outperforms state-of-the-art approaches, effectively mitigating mode collapse while enhancing both the spatial resolution of gene expression maps and biological interpretability.
Abstract
Recent advances in spatial transcriptomics (ST) enable the characterization of spatial gene expression within tissue for discovery research. However, current ST platforms suffer from low resolution, hindering an in-depth understanding of spatial gene expression. Super-resolution approaches promise to enhance ST maps by integrating histology images with the gene expression of profiled tissue spots, but current methods are limited by restoration uncertainty and mode collapse. Although diffusion models have shown promise in capturing complex interactions between multi-modal conditions, integrating histology images and gene expression to produce super-resolved ST maps remains challenging. This paper proposes a cross-modal conditional diffusion model for super-resolving ST maps under the guidance of histology images. Specifically, we design a multi-modal disentangling network with cross-modal adaptive modulation to exploit the complementary information in histology images and spatial gene expression. Moreover, we propose a dynamic cross-attention modelling strategy to extract hierarchical cell-to-tissue information from histology images. Lastly, we propose a co-expression-based gene-correlation graph network to model the co-expression relationships among multiple genes. Experiments on three public datasets show that our method outperforms state-of-the-art methods in ST super-resolution.
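To make the "cross-modal adaptive modulation" idea concrete, the sketch below shows a common realization of such conditioning: FiLM-style feature modulation, where histology features predict a per-channel scale and shift that modulate the gene-expression features inside the denoiser. This is a minimal illustration under that assumption; all function and variable names here are hypothetical and the paper's exact modulation mechanism may differ.

```python
import numpy as np

def adaptive_modulation(gene_feat, hist_feat, w_scale, w_shift):
    """FiLM-style cross-modal modulation (illustrative sketch).

    The histology features predict a per-channel scale and shift,
    which then modulate the gene-expression features so the image
    modality can steer the diffusion denoiser's representation.
    """
    scale = hist_feat @ w_scale          # (batch, channels)
    shift = hist_feat @ w_shift          # (batch, channels)
    return (1.0 + scale) * gene_feat + shift

rng = np.random.default_rng(0)
gene = rng.normal(size=(4, 8))           # spot-level gene features
hist = rng.normal(size=(4, 16))          # histology patch features
w_s = rng.normal(size=(16, 8)) * 0.1     # learned projection (scale)
w_b = rng.normal(size=(16, 8)) * 0.1     # learned projection (shift)

out = adaptive_modulation(gene, hist, w_s, w_b)
print(out.shape)  # (4, 8)
```

In practice the scale/shift projections would be learned jointly with the diffusion backbone; the `1.0 +` residual form keeps the modulation close to identity at initialization, which tends to stabilize conditional training.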