Linking spatial biology and clinical histology via Haiku

📅 2026-04-30

🤖 AI Summary
This study addresses the limited integration of molecular, morphological, and clinical data in spatial biology by proposing Haiku, a tri-modal contrastive learning framework that aligns spatial proteomics (26.7 million multiplexed immunofluorescence patches), H&E histomorphology, and clinical metadata in a shared embedding space. Haiku supports cross-modal retrieval, clinical outcome prediction, and zero-shot biomarker inference. It also introduces a counterfactual prediction framework that modifies only the clinical metadata while holding tissue morphology fixed, surfacing niche-specific molecular shifts that the authors present as exploratory and hypothesis-generating. Haiku outperforms the compared baselines in cross-modal retrieval (Recall@50 up to 0.611), survival prediction (C-index = 0.737, a 7.91% relative improvement), and zero-shot inference (mean Pearson correlation = 0.718 across 52 biomarkers).
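To make the Recall@50 figure concrete, the following minimal sketch shows how Recall@K is commonly computed for paired cross-modal retrieval: each query embedding from one modality is ranked against all embeddings from another, and a hit is counted when its paired item lands in the top K. The function names, the cosine-similarity ranking, and the toy vectors are illustrative assumptions, not the paper's evaluation code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def recall_at_k(queries, gallery, k):
    """Fraction of queries whose paired gallery item (same index) ranks in the top k."""
    hits = 0
    for i, q in enumerate(queries):
        sims = [cosine(q, g) for g in gallery]
        # Gallery indices sorted by decreasing similarity to the query.
        ranked = sorted(range(len(gallery)), key=lambda j: sims[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(queries)

# Hypothetical toy embeddings: query i is paired with gallery item i.
he_embeddings = [[1.0, 0.1], [0.1, 1.0], [0.8, 0.6]]
mif_embeddings = [[0.9, 0.2], [0.6, 0.8], [0.2, 0.9]]

for k in (1, 2, 3):
    print(f"Recall@{k} = {recall_at_k(he_embeddings, mif_embeddings, k):.3f}")
```

In this toy setup only the first query retrieves its pair at rank 1, so recall rises as K grows; a real evaluation would rank against a much larger gallery.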
📝 Abstract
Integrating molecular, morphological, and clinical data is essential for basic and translational biomedical research, yet systematic frameworks for jointly modeling these modalities remain limited. Here we present Haiku, a tri-modal contrastive learning model trained on a multiplexed immunofluorescence (mIF) dataset comprising 26.7 million spatial proteomics patches from 3,218 tissue sections across 1,606 patients spanning 11 organ types, with matched hematoxylin and eosin (H&E) histology and clinical metadata aligned in a shared embedding space. Haiku enables three-way cross-modal retrieval, improves downstream classification and clinical prediction tasks over unimodal baselines, and supports zero-shot biomarker inference through fusion retrieval conditioned on text descriptions derived from clinical metadata alone. Across tasks, Haiku outperforms competing approaches in cross-modal retrieval (Recall@50 up to 0.611 versus a near-zero baseline), survival prediction (C-index 0.737, a 7.91% relative improvement), and zero-shot biomarker inference (mean Pearson correlation 0.718 across 52 biomarkers). Furthermore, we introduce a counterfactual prediction framework in which modifying only the clinical metadata while holding tissue morphology fixed surfaces niche-specific molecular shifts associated with breast cancer stage progression and lung cancer survival outcomes. In a lung adenocarcinoma case study, the counterfactual analysis recovers niche-specific shifts characterized by increased CD8 and granzyme B, reduced PD-L1, and decreased Ki67, broadly consistent with patterns reported for favorable outcomes. We present these counterfactual results as exploratory, hypothesis-generating signals rather than mechanistic claims. These capabilities demonstrate that tri-modal alignment via Haiku enables integrative analysis of spatial biology, bridging molecular measurements with clinical context for biological exploration.
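The abstract does not spell out the training objective, so the sketch below is only one plausible reading of "tri-modal contrastive learning": a CLIP-style symmetric InfoNCE loss applied to each of the three modality pairs and summed. The pairwise decomposition, the equal weighting, the temperature of 0.07, and every function name are assumptions made for illustration, not details taken from the paper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(anchors, targets, temperature=0.07):
    """Mean cross-entropy of matching anchor i to target i among all targets."""
    a = [l2_normalize(v) for v in anchors]
    t = [l2_normalize(v) for v in targets]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [sum(x * y for x, y in zip(ai, tj)) / temperature for tj in t]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denominator - logits[i]
    return total / len(a)

def symmetric_info_nce(x, y, temperature=0.07):
    """Average the x-to-y and y-to-x directions, as in CLIP-style training."""
    return 0.5 * (info_nce(x, y, temperature) + info_nce(y, x, temperature))

def trimodal_loss(mif, he, clinical, temperature=0.07):
    """Sum of symmetric InfoNCE over the three modality pairs (equal weights assumed)."""
    return (symmetric_info_nce(mif, he, temperature)
            + symmetric_info_nce(mif, clinical, temperature)
            + symmetric_info_nce(he, clinical, temperature))

# Hypothetical toy batch of two samples with 2-D embeddings per modality.
mif = [[1.0, 0.0], [0.0, 1.0]]
he = [[1.0, 0.0], [0.0, 1.0]]
clinical = [[1.0, 0.0], [0.0, 1.0]]
he_shuffled = [he[1], he[0]]  # breaks the H&E pairing

print("aligned loss:   ", trimodal_loss(mif, he, clinical))
print("misaligned loss:", trimodal_loss(mif, he_shuffled, clinical))
```

The loss is near zero when matched samples share an embedding across all three modalities and grows sharply when one modality's pairing is shuffled, which is the behavior a shared embedding space relies on for retrieval.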
Problem

Research questions and friction points this paper is trying to address.

spatial biology
multimodal integration
clinical histology
molecular data
clinical metadata
Innovation

Methods, ideas, or system contributions that make the work stand out.

tri-modal contrastive learning
spatial proteomics
cross-modal retrieval
zero-shot biomarker inference
counterfactual prediction
Yan Cui
Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA
Jacob S. Leiby
Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA
Wenhui Lei
University of Pennsylvania
AI4Health, Artificial Intelligence
Dokyoon Kim
Associate Professor of Informatics at University of Pennsylvania
Biomedical Informatics, Bioinformatics, Systems Biology, Data Integration, Precision Medicine
Yanxiang Deng
Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA
Aaron T. Mayer
Enable Medicine, Menlo Park, CA, USA
Zhenqin Wu
Enable Medicine, Menlo Park, CA, USA
Alexandro E. Trevino
Enable Medicine, Menlo Park, CA, USA
Zhi Huang
Assistant Professor, University of Pennsylvania
Biomedical Data Science, AI, Computational Pathology