Robust Multimodal Survival Prediction with the Latent Differentiation Conditional Variational AutoEncoder

📅 2025-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the limited robustness of multimodal cancer survival prediction when genomic data are missing at test time, a common situation in clinical practice because genomic profiling is costly. The authors propose LD-CVAE, a latent differentiation conditional variational autoencoder. First, a Variational Information Bottleneck Transformer (VIB-Trans) learns compressed pathological representations from gigapixel whole-slide images (WSIs). Second, a Latent Differentiation Variational AutoEncoder (LD-VAE) learns common and specific posteriors for genomic embeddings with diverse functions, so that different functional genomic features can be generated. Finally, a product-of-experts (PoE) step integrates the genomic common posterior and the image posterior to estimate the joint latent distribution. Experiments on five cancer datasets show that LD-CVAE outperforms the compared methods in both complete-modality and missing-modality settings.

📝 Abstract
The integrative analysis of histopathological images and genomic data has received increasing attention for survival prediction of human cancers. However, existing studies typically assume that all modalities are available. In practice, collecting genomic data is costly, so genomic data are sometimes unavailable for test samples. A common way of tackling such incompleteness is to generate the genomic representations from the pathology images. Nevertheless, this strategy still faces two challenges: (1) gigapixel whole slide images (WSIs) are huge and thus hard to represent compactly; (2) it is difficult to generate genomic embeddings with diverse function categories in a unified generative framework. To address these challenges, we propose a Conditional Latent Differentiation Variational AutoEncoder (LD-CVAE) for robust multimodal survival prediction, even with missing genomic data. Specifically, a Variational Information Bottleneck Transformer (VIB-Trans) module is proposed to learn compressed pathological representations from the gigapixel WSIs. To generate different functional genomic features, we develop a novel Latent Differentiation Variational AutoEncoder (LD-VAE) to learn the common and specific posteriors for the genomic embeddings with diverse functions. Finally, we use the product-of-experts technique to integrate the genomic common posterior and the image posterior for joint latent distribution estimation in LD-CVAE. We test the effectiveness of our method on five different cancer datasets, and the experimental results demonstrate its superiority in both complete and missing modality scenarios.
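The compression idea behind a variational information bottleneck can be illustrated with two generic building blocks: a reparameterized sample from a diagonal Gaussian code, and the KL penalty that pulls that code toward a standard normal prior. The sketch below is a minimal plain-Python illustration under stated assumptions, not the VIB-Trans module itself; the function names `reparameterize` and `kl_to_standard_normal`, the standard normal prior, and the list-based inputs are all hypothetical choices made for illustration.

```python
import math
import random


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one latent code.

    Closed form: 0.5 * sum(mu^2 + var - log(var) - 1) over dimensions.
    A larger value means the code carries more information than the prior.
    """
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, logvar))


def reparameterize(mu, logvar, rng=random):
    """Draw z = mu + std * eps with eps ~ N(0, 1) per dimension.

    Sampling this way keeps z a deterministic function of (mu, logvar)
    plus external noise, which is what makes the encoder trainable by
    gradient descent in an autodiff framework.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]
```

In a training loop, a term like `beta * kl_to_standard_normal(mu, logvar)` would typically be added to the task loss, where `beta` (an assumed hyperparameter) trades off compression against predictive accuracy.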
Problem

Research questions and friction points this paper is trying to address.

Predict cancer survival using histopathological images and genomic data.
Handle missing genomic data by generating representations from pathology images.
Develop a unified framework for diverse functional genomic feature generation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

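Proposes a Conditional Latent Differentiation Variational AutoEncoder (LD-CVAE) for survival prediction that remains robust when genomic data are missing.
Introduces a Variational Information Bottleneck Transformer (VIB-Trans) to learn compressed pathological representations from gigapixel WSIs.
Applies product-of-experts to integrate the genomic common posterior and the image posterior into a joint latent distribution.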
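
The product-of-experts step in the last item has a simple closed form when every expert is a diagonal Gaussian: precisions add, and the fused mean is the precision-weighted average of the expert means. The sketch below shows only that generic rule in plain Python. The function name `product_of_experts` and the list-based inputs are assumptions for illustration; whether LD-CVAE adds a prior expert or uses this exact parameterization is not stated on this page.

```python
import math


def product_of_experts(mus, logvars):
    """Fuse diagonal Gaussian experts N(mu_i, exp(logvar_i)) into one Gaussian.

    mus, logvars: one list per expert, all of the same length.
    Returns (mu, logvar) of the normalized product of the experts.
    """
    dim = len(mus[0])
    fused_mu, fused_logvar = [], []
    for d in range(dim):
        # Precision is the inverse variance of each expert in this dimension.
        precisions = [math.exp(-lv[d]) for lv in logvars]
        total_precision = sum(precisions)
        # Confident experts (high precision) dominate the fused mean.
        weighted = sum(p * m[d] for p, m in zip(precisions, mus))
        fused_mu.append(weighted / total_precision)
        fused_logvar.append(-math.log(total_precision))
    return fused_mu, fused_logvar


# Hypothetical one-dimensional posteriors for the two modalities.
image_mu, image_logvar = [0.0], [0.0]
genomic_mu, genomic_logvar = [2.0], [0.0]
joint_mu, joint_logvar = product_of_experts(
    [image_mu, genomic_mu], [image_logvar, genomic_logvar]
)
```

With two unit-variance experts centered at 0 and 2, the fused Gaussian has mean 1 and variance 0.5. Passing only the image expert returns that expert unchanged, which is one reason product-of-experts fusion is a convenient fit for settings where a modality may be absent.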
Junjie Zhou
Nanjing University
Computer Vision, Machine Learning

Jiao Tang
The College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, The Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education

Ying Zuo
The College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, The Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education

Peng Wan
Nanjing University of Aeronautics and Astronautics

Daoqiang Zhang
Nanjing University of Aeronautics and Astronautics
Machine learning, pattern recognition, medical image analysis, data mining

Wei Shao
The College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, The Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education