Masked Omics Modeling for Multimodal Representation Learning across Histopathology and Molecular Profiles

📅 2025-08-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Hematoxylin and eosin (H&E)-stained histopathology images often inadequately capture underlying molecular features and clinical phenotypes. Method: We propose MORPHEUS, a Transformer-based multimodal self-supervised learning framework designed to bridge histopathology and high-dimensional omics data. Its core innovations are: (1) masked omics modeling, which jointly encodes whole-slide images with transcriptomic, methylomic, and other omics modalities; and (2) a cross-modal shared latent space enabling bidirectional inference between H&E and omics, as well as flexible generation and prediction from arbitrary subsets of input modalities. Results: Pretrained on a large pan-cancer cohort, MORPHEUS significantly outperforms existing methods in molecular subtyping and clinical outcome prediction, and is the first framework to achieve interpretable, generalizable, and unified representation learning across histopathology and multimodal omics data.

📝 Abstract
Self-supervised learning has driven major advances in computational pathology by enabling models to learn rich representations from hematoxylin and eosin (H&E)-stained cancer tissue. However, histopathology alone often falls short for molecular characterization and understanding clinical outcomes, as important information is contained in high-dimensional omics profiles like transcriptomics, methylomics, or genomics. In this work, we introduce MORPHEUS, a unified transformer-based pre-training framework that encodes both histopathology and multi-omics data into a shared latent space. At its core, MORPHEUS relies on a masked modeling objective applied to randomly selected omics portions, encouraging the model to learn biologically meaningful cross-modal relationships. The same pre-trained network can be applied to histopathology alone or in combination with any subset of omics modalities, seamlessly adapting to the available inputs. Additionally, MORPHEUS enables any-to-any omics generation, allowing one or more omics profiles to be inferred from any subset of modalities, including H&E alone. Pre-trained on a large pan-cancer cohort, MORPHEUS consistently outperforms state-of-the-art methods across diverse modality combinations and tasks, positioning itself as a promising framework for developing multimodal foundation models in oncology. The code is available at: https://github.com/Lucas-rbnt/MORPHEUS
Problem

Research questions and friction points this paper is trying to address.

Integrates histopathology and multi-omics data for cancer analysis
Learns cross-modal relationships via masked modeling of omics
Enables any-to-any omics generation from partial inputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based multimodal pre-training framework
Masked modeling for cross-modal learning
Any-to-any omics generation from histopathology
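The masked omics modeling objective described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the masking ratio, and the use of zeroed-out tokens with an MSE reconstruction loss on masked positions are all assumptions; in the actual framework, the reconstruction would be produced by the transformer encoder from the unmasked histopathology and omics tokens.

```python
import numpy as np


def mask_omics(tokens, mask_ratio=0.5, rng=None):
    """Randomly mask a fraction of omics tokens (rows of a tokens-by-features array).

    Returns the masked token array (masked rows zeroed out, a common placeholder
    choice) and a boolean mask marking which rows the model must reconstruct.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n_tokens = tokens.shape[0]
    n_masked = int(round(mask_ratio * n_tokens))
    masked_idx = rng.choice(n_tokens, size=n_masked, replace=False)
    mask = np.zeros(n_tokens, dtype=bool)
    mask[masked_idx] = True
    masked_tokens = tokens.copy()
    masked_tokens[mask] = 0.0  # hide the selected omics tokens from the encoder
    return masked_tokens, mask


def masked_mse(pred, target, mask):
    """Reconstruction loss computed only on the masked positions."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))
```

In a training loop, the model would receive `masked_tokens` (together with the histopathology tokens), predict the hidden omics values, and be penalized via `masked_mse`, encouraging it to infer missing omics content from the remaining modalities.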
Lucas Robinet
Oncopole Claudius Regaud, IRT Saint-Exupéry
Multimodal Deep Learning · Oncology Research
Ahmad Berjaoui
IRT Saint-Exupéry
Elizabeth Cohen-Jonathan Moyal
Oncopole Claudius Regaud, INSERM Cancer Research Center of Toulouse, Toulouse