🤖 AI Summary
This study addresses the challenge of cost-effectively inferring spatial gene expression profiles from routine hematoxylin and eosin (H&E)-stained histopathological images, thereby bridging the gap between morphological and molecular information. The authors propose HECLIP, a histology-enhanced contrastive learning framework featuring a dual-branch encoder that decouples representation learning from expression reconstruction. Key components include an image-centric contrastive loss, self-supervised representation alignment, and a cross-modal mapping module, which together reduce reliance on scarce paired multimodal data. Evaluated on multiple public spatial transcriptomics datasets, the method outperforms existing state-of-the-art approaches and generates spatially continuous, biologically plausible gene expression predictions. The framework is end-to-end trainable, and its source code is publicly available, offering practical utility for integrative spatial omics analysis.
📝 Abstract
Histopathology, particularly hematoxylin and eosin (H&E) staining, plays a critical role in diagnosing and characterizing pathological conditions by highlighting tissue morphology. However, H&E-stained images inherently lack molecular information, requiring costly and resource-intensive methods like spatial transcriptomics to map gene expression with spatial resolution. To address these challenges, we introduce HECLIP (Histology-Enhanced Contrastive Learning for Imputation of Profiles), an innovative deep learning framework that bridges the gap between histological imaging and molecular profiling. HECLIP is specifically designed to infer gene expression profiles directly from H&E-stained images, eliminating the need for expensive spatial transcriptomics assays. HECLIP leverages an advanced image-centric contrastive loss function to optimize image representation learning, ensuring that critical morphological patterns in histology images are effectively captured and translated into accurate gene expression profiles. This design enhances the predictive power of the image modality while minimizing reliance on gene expression data. Through extensive benchmarking on publicly available datasets, HECLIP demonstrates superior performance compared to existing approaches, delivering robust and biologically meaningful predictions. Detailed ablation studies further underscore its effectiveness in extracting molecular insights from histology images. Additionally, HECLIP's scalable and cost-efficient approach positions it as a transformative tool for both research and clinical applications, driving advancements in precision medicine. The source code for HECLIP is openly available at https://github.com/QSong-github/HECLIP.
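The abstract does not specify the form of the image-centric contrastive loss. As a rough illustration only, the sketch below assumes it resembles a one-directional InfoNCE objective in which each image embedding is the query and the expression embeddings in the batch are the candidates, so the loss is computed only in the image-to-expression direction. The function names, the temperature value, and the toy embeddings are hypothetical and are not taken from the HECLIP repository.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def image_to_expression_loss(image_emb, expr_emb, temperature=0.1):
    """One-directional InfoNCE loss (assumed form, not the authors' code).

    image_emb[i] and expr_emb[i] are embeddings of the same tissue spot.
    Each image embedding is scored against every expression embedding in
    the batch, and the matching spot is treated as the correct class.
    """
    if not image_emb or len(image_emb) != len(expr_emb):
        raise ValueError("need two non-empty batches of equal size")

    images = [l2_normalize(v) for v in image_emb]
    exprs = [l2_normalize(v) for v in expr_emb]

    total = 0.0
    for i, query in enumerate(images):
        # Cosine similarity to every expression embedding, scaled by temperature.
        logits = [
            sum(a * b for a, b in zip(query, key)) / temperature
            for key in exprs
        ]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the paired spot (index i) as the target.
        total += log_denominator - logits[i]
    return total / len(images)


if __name__ == "__main__":
    # Toy batch: two spots with 2-dimensional embeddings.
    aligned = image_to_expression_loss([[1.0, 0.0], [0.0, 1.0]],
                                       [[1.0, 0.0], [0.0, 1.0]])
    swapped = image_to_expression_loss([[1.0, 0.0], [0.0, 1.0]],
                                       [[0.0, 1.0], [1.0, 0.0]])
    print(f"aligned pairs: {aligned:.4f}, swapped pairs: {swapped:.4f}")
```

Under this assumed form, correctly paired embeddings yield a loss near zero while mismatched pairs are penalized heavily. A symmetric CLIP-style loss would add the expression-to-image direction as well; omitting it is one plausible reading of "image-centric", not a confirmed detail of the method.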