Bridging the Modality Bottleneck in Pathology MIL through Virtual Molecular Staining

📅 2026-05-12
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the limitation of existing multiple instance learning (MIL) approaches in computational pathology, which rely solely on morphological features from H&E-stained images and struggle to model endpoint tasks governed by molecular states, such as survival outcomes, biomarker status, or molecular subtypes. To overcome this, the authors propose MIST, a novel method that introduces a molecular-informed virtual staining mechanism into MIL. By leveraging paired spatial transcriptomics data, MIST constructs cross-modal prototypes that reorganize H&E-derived features along molecularly guided axes, thereby enhancing representation without requiring transcriptomic input during inference. The approach integrates spatial transcriptomics clustering, prototype anchoring, and frozen foundation model mapping. Evaluated across 23 downstream tasks and 8 MIL aggregators, MIST outperforms baselines in 240 out of 256 configurations, achieving an average improvement of 3.5% (including +5.2% in survival prediction, +3.3% in subtype classification, and +2.6% in biomarker prediction).
πŸ“ Abstract
Multiple instance learning (MIL) is the dominant framework for whole-slide image analysis in computational pathology, typically combining a frozen patch encoder, a projection layer, and a slide-level aggregator. While encoders and aggregators have been extensively studied, the projection layer remains a largely morphology-only bottleneck. This limits endpoints such as biomarker status and survival, which are governed by a molecular state that is not fully captured by H&E morphology. We introduce Molecularly Informed Staining Transform (MIST), a plug-in replacement for the MIL projection layer that uses paired spatial transcriptomics only during training to construct virtual molecular stains. MIST clusters gene expression profiles into cross-modal prototypes, anchors them in the frozen foundation model feature space, and uses them to reorganize H&E patch features along molecularly guided axes. It requires no transcriptomics at inference and can be inserted before standard MIL aggregators. We evaluate MIST across 23 downstream tasks and 8 MIL aggregators. MIST improves 240 of 256 configurations over the standard projection layer, with an average gain of +3.5%, observed consistently across endpoint types: +5.2% on survival prediction, +3.3% on tissue subtyping, and +2.6% on biomarker prediction. Ablations confirm that gene-derived prototypes are the primary source of the gains, while spatial, biological, and pathological analyses show that cross-modal prototype affinities capture spatially coherent molecular programs from H&E alone.
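The three steps named in the abstract (cluster gene expression, anchor the clusters in the H&E feature space, re-express patch features by their affinity to those anchors) can be illustrated with a toy sketch. This is not the authors' implementation: the use of plain k-means, mean-feature anchoring, cosine similarity with a softmax, the temperature value, and all function names and data below are assumptions made for illustration only.

```python
import math
import random

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed clustering choice); returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members; keep it unchanged if empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def anchor_prototypes(features, labels):
    """Anchor each gene-expression cluster at the mean H&E feature of its paired patches."""
    protos = {}
    for c in sorted(set(labels)):
        members = [f for f, lab in zip(features, labels) if lab == c]
        protos[c] = [sum(col) / len(members) for col in zip(*members)]
    return protos

def cosine(u, v):
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv + 1e-12)

def virtual_stain(feature, prototypes, temperature=0.1):
    """Softmax over cosine similarities: one affinity per prototype, summing to 1."""
    logits = {c: cosine(feature, p) / temperature for c, p in prototypes.items()}
    top = max(logits.values())
    exps = {c: math.exp(x - top) for c, x in logits.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

# Toy paired training data (hypothetical): 6 spots, 2 genes, 3-dim patch features.
expr = [[5.0, 0.1], [4.8, 0.0], [5.2, 0.2], [0.1, 4.9], [0.0, 5.1], [0.2, 5.0]]
feats = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [1.1, 0.1, 0.1],
         [0.0, 0.1, 1.0], [0.1, 0.0, 0.9], [0.1, 0.1, 1.1]]

labels = kmeans(expr, k=2)                    # training only: uses transcriptomics
prototypes = anchor_prototypes(feats, labels) # training only: uses the pairing
stain = virtual_stain([1.0, 0.0, 0.05], prototypes)  # inference: H&E feature only
print(stain)
```

Note that `virtual_stain` takes only an H&E patch feature and the stored prototypes, which mirrors the abstract's claim that no transcriptomics is needed at inference. In a real pipeline the resulting affinities would feed a standard MIL aggregator; how MIST combines them with the original features is not specified in this listing.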
Problem

Research questions and friction points this paper is trying to address.

multiple instance learning
computational pathology
molecular state
H&E morphology
modality bottleneck
Innovation

Methods, ideas, or system contributions that make the work stand out.

virtual molecular staining
multiple instance learning
spatial transcriptomics
cross-modal prototypes
computational pathology
Yucheng Xing
National University of Singapore, Singapore
Pei Liu
Hunan University, China
Jingying Ma
National University of Singapore, Singapore
Ruping Hong
Peking Union Medical College Hospital (PUMCH), China
Jiangdong Qiu
Peking Union Medical College Hospital (PUMCH), China
Tianyu Liu
National University of Singapore, Singapore
Kai He
National University of Singapore | NTU | XJTU
Large Language Model, AI for Healthcare, Affective Computing, Information Extraction
Ling Huang
Imperial, NUS, UTC, CNRS, Sorbonne alliance
Uncertainty quantification, Trustworthy AI, Medical data analysis, Cardiovascular computing
Mengling Feng
National University of Singapore, Singapore