MMAP: A Multi-Magnification and Prototype-Aware Architecture for Predicting Spatial Gene Expression

📅 2025-10-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the cross-modal gap in predicting spatial gene expression from H&E-stained whole-slide images: semantic misalignment between visual features and molecular signals, insufficient resolution of local texture, and inadequate modeling of global spatial context. The authors propose a multi-scale feature-fusion strategy coupled with a sliding-window prototype enhancement mechanism: parallel extraction of multi-magnification image-patch representations sharpens local structural perception, while learned prototype embeddings capture tissue-level spatial dependencies, enabling global context aggregation at controllable granularity. The method uses an end-to-end deep regression framework to predict genome-wide expression profiles directly from histology. Evaluated on multiple public spatial transcriptomics datasets, it achieves significant improvements over state-of-the-art methods on MAE, MSE, and Pearson correlation coefficient (PCC), demonstrating stronger cross-modal mapping capability and biological interpretability.

📝 Abstract
Spatial Transcriptomics (ST) enables the measurement of gene expression while preserving spatial information, offering critical insights into tissue architecture and disease pathology. Recent developments have explored the use of hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) to predict transcriptome-wide gene expression profiles through deep neural networks. This task is commonly framed as a regression problem, where each input corresponds to a localized image patch extracted from the WSI. However, predicting spatial gene expression from histological images remains a challenging problem due to the significant modality gap between visual features and molecular signals. Recent studies have attempted to incorporate both local and global information into predictive models. Nevertheless, existing methods still suffer from two key limitations: (1) insufficient granularity in local feature extraction, and (2) inadequate coverage of global spatial context. In this work, we propose a novel framework, MMAP (Multi-MAgnification and Prototype-enhanced architecture), that addresses both challenges simultaneously. To enhance local feature granularity, MMAP leverages multi-magnification patch representations that capture fine-grained histological details. To improve global contextual understanding, it learns a set of latent prototype embeddings that serve as compact representations of slide-level information. Extensive experimental results demonstrate that MMAP consistently outperforms all existing state-of-the-art methods across multiple evaluation metrics, including Mean Absolute Error (MAE), Mean Squared Error (MSE), and Pearson Correlation Coefficient (PCC).
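The evaluation metrics named in the abstract (MAE, MSE, and per-gene PCC over spot-by-gene expression matrices) can be sketched as follows. This is a minimal illustration of the standard definitions, not the paper's evaluation code; the function name and matrix shapes are hypothetical.

```python
import numpy as np

def expression_metrics(y_true, y_pred):
    """MAE, MSE, and mean per-gene Pearson correlation (PCC) for
    spot-by-gene expression matrices of shape (n_spots, n_genes)."""
    mae = np.abs(y_true - y_pred).mean()
    mse = ((y_true - y_pred) ** 2).mean()
    # PCC is computed per gene across spots, then averaged over genes.
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    denom = np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    pcc = ((yt * yp).sum(axis=0) / denom).mean()
    return mae, mse, pcc
```

Per-gene PCC is the common choice in this task because it rewards recovering each gene's spatial pattern across spots, independent of its absolute expression scale.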
Problem

Research questions and friction points this paper is trying to address.

Predicting spatial gene expression from histological images
Bridging modality gap between visual features and molecular signals
Enhancing local granularity and global context in prediction models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multi-magnification patches for fine-grained details
Learns latent prototype embeddings for global context
Combines local and global features to predict expression
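The three ideas above can be sketched in NumPy: fuse features from two magnifications, attend from each spot to a small set of prototype embeddings for slide-level context, and regress gene expression from the combined representation. Everything here is a hypothetical stand-in for illustration — the feature dimensions, random projections, and attention form are assumptions, not MMAP's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: N spots, d-dim features, K prototypes, G genes.
n_spots, d, n_protos, n_genes = 64, 128, 16, 250

# Stand-ins for features of the same spot at two magnifications, as if
# produced by a histology encoder on low- and high-magnification patches.
feats_low = rng.standard_normal((n_spots, d))
feats_high = rng.standard_normal((n_spots, d))

# 1) Multi-magnification fusion: concatenate and project back to d dims.
W_fuse = rng.standard_normal((2 * d, d)) * 0.05
local = np.concatenate([feats_low, feats_high], axis=1) @ W_fuse  # (N, d)

# 2) Prototype enhancement: cross-attention from spot features to a small
#    set of prototype embeddings (learned in training) that summarize
#    slide-level context.
prototypes = rng.standard_normal((n_protos, d))
attn = softmax(local @ prototypes.T / np.sqrt(d))  # (N, K), rows sum to 1
global_ctx = attn @ prototypes                     # (N, d)

# 3) Regression head on the combined local + global representation.
W_out = rng.standard_normal((2 * d, n_genes)) * 0.05
pred_expr = np.concatenate([local, global_ctx], axis=1) @ W_out  # (N, G)
```

Using a fixed prototype set keeps the global-context step independent of slide size: every spot attends to K prototypes rather than to all other spots, which is what makes the aggregation granularity controllable.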
Hai Dang Nguyen
Institute for AI Innovation and Societal Impact, Hanoi University of Science and Technology, Hanoi, Vietnam
Nguyen Dang Huy Pham
Amsterdam High School for the Gifted, Hanoi, Vietnam
The Minh Duc Nguyen
Institute for AI Innovation and Societal Impact, Hanoi University of Science and Technology, Hanoi, Vietnam
Dac Thai Nguyen
Unknown affiliation
Hang Thi Nguyen
Anatomic Pathology Division, Laboratory Department, Vinmec Times City International Hospital, Vinmec Healthcare System, Hanoi, Vietnam
Duong M. Nguyen
PhD Student, CS @ University of Illinois at Urbana-Champaign
Machine Learning · Computer Vision · Federated Learning · Generative Model