Genome-Anchored Foundation Model Embeddings Improve Molecular Prediction from Histology Images

📅 2025-06-24
🤖 AI Summary
A key challenge in precision oncology is predicting complex molecular features and patient prognosis directly from routine whole-slide images (WSIs), bypassing costly and time-consuming genomic assays. To address this, we propose PathLUPI, which uses transcriptomic data as privileged information during training (the learning using privileged information, or LUPI, paradigm) to construct a genome-anchored histopathological embedding space. Crucially, PathLUPI infers molecular phenotypes and survival risk from WSIs alone, without requiring molecular data at test time. Built upon a multiple-instance learning framework, it integrates self-supervised contrastive learning to refine histologic representations. Evaluated across 20 cohorts comprising 11,257 samples and 49 prediction tasks, PathLUPI achieves AUC ≥ 0.80 on 14 biomarker prediction and molecular subtyping tasks and C-index ≥ 0.70 on survival prediction for five cancer types. Its embeddings also reveal cellular morphological signatures associated with specific genotypes, which supports interpretability and integration into routine pathology workflows.

📝 Abstract
Precision oncology requires accurate molecular insights, yet obtaining these directly from genomics is costly and time-consuming for broad clinical use. Predicting complex molecular features and patient prognosis directly from routine whole-slide images (WSI) remains a major challenge for current deep learning methods. Here we introduce PathLUPI, which uses transcriptomic privileged information during training to extract genome-anchored histological embeddings, enabling effective molecular prediction using only WSIs at inference. Through extensive evaluation across 49 molecular oncology tasks using 11,257 cases among 20 cohorts, PathLUPI demonstrated superior performance compared to conventional methods trained solely on WSIs. Crucially, it achieves AUC $\geq$ 0.80 in 14 of the biomarker prediction and molecular subtyping tasks and C-index $\geq$ 0.70 in survival cohorts of 5 major cancer types. Moreover, PathLUPI embeddings reveal distinct cellular morphological signatures associated with specific genotypes and related biological pathways within WSIs. By effectively encoding molecular context to refine WSI representations, PathLUPI overcomes a key limitation of existing models and offers a novel strategy to bridge molecular insights with routine pathology workflows for wider clinical application.
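
The two headline metrics, AUC for classification tasks and C-index for survival tasks, can be made concrete with a small sketch. The code below is an illustrative, standard-library implementation of the textbook definitions, not the evaluation code used in the paper. The function names and the toy inputs are hypothetical, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's C-index over comparable patient pairs.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (events[i] truthy). The pair is concordant when
    that patient also has the higher predicted risk; tied risks count 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times (simplification) or censored earlier patient
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under these definitions, a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported thresholds of 0.80 (AUC) and 0.70 (C-index) both describe ranking quality rather than raw accuracy.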
Problem

Research questions and friction points this paper is trying to address.

Predict molecular features from histology images efficiently
Improve accuracy in oncology tasks using genome-anchored embeddings
Bridge molecular insights with routine pathology workflows
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses transcriptomic privileged information for training
Generates genome-anchored histological embeddings from WSIs
Improves molecular prediction accuracy in oncology tasks
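
The privileged-information idea listed above can be sketched in a few lines: the transcriptomic signal enters only the training loss, while the prediction path takes the slide alone. This is a toy illustration under stated assumptions, not the authors' architecture or objective. Mean pooling over patches, the squared-distance alignment term, the weight `lam`, and every function and parameter name here are hypothetical choices made for the example; the page does not specify the actual loss.

```python
import math


def mean_pool(patch_features):
    """Aggregate a bag of patch vectors into one slide vector.

    Mean pooling is the simplest multiple-instance aggregator (an assumption).
    """
    n = len(patch_features)
    return [sum(col) / n for col in zip(*patch_features)]


def linear(weights, x):
    """Apply a weight matrix (given as a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def embed_slide(patch_features, w_embed):
    """WSI-only path: patches -> slide embedding (training and inference)."""
    return linear(w_embed, mean_pool(patch_features))


def predict(patch_features, w_embed, w_head):
    """Inference uses the slide alone; no transcriptomic argument exists."""
    z = embed_slide(patch_features, w_embed)
    return sigmoid(sum(w * v for w, v in zip(w_head, z)))


def training_loss(patch_features, label, gene_embedding, w_embed, w_head, lam=1.0):
    """Binary cross-entropy plus a term anchoring z to the privileged embedding.

    gene_embedding stands in for a transcriptome-derived vector that is
    available only during training (hypothetical, same dimension as z).
    """
    z = embed_slide(patch_features, w_embed)
    p = sigmoid(sum(w * v for w, v in zip(w_head, z)))
    eps = 1e-12  # guards against log(0)
    task = -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))
    align = sum((a - b) ** 2 for a, b in zip(z, gene_embedding))
    return task + lam * align
```

The design point is the asymmetry between the two signatures: `training_loss` consumes `gene_embedding`, whereas `predict` cannot, which mirrors the claim that molecular data are needed only during training.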
Cheng Jin
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
Fengtao Zhou
Hong Kong University of Science and Technology
Multimodal Learning, Computational Pathology
Yunfang Yu
Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Department of Medical Oncology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
Jiabo Ma
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
Yihui Wang
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
Yingxue Xu
The Hong Kong University of Science and Technology
Multimodal Learning, Survival Analysis, Computational Pathology
Huajun Zhou
The Hong Kong University of Science and Technology
Computer Vision, Medical Image Processing
Hao Jiang
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
Luyang Luo
Department of Biomedical Informatics, Harvard University, Boston, USA
Luhui Mao
Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Department of Medical Oncology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
Zifan He
University of California - Los Angeles
FPGA, HPC, Machine Learning
Xiuming Zhang
Department of Pathology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
Jing Zhang
Department of Pathology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
Ronald Chan
Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Hong Kong SAR, China
Herui Yao
Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Department of Medical Oncology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
Hao Chen
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China; Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China; HKUST Shenzhen-Hong Kong Collaborative Innovation Research Institute, Shenzhen, China; State Key Laboratory of Nervous System Disorders, The Hong Kong University of