MUST: Modality-Specific Representation-Aware Transformer for Diffusion-Enhanced Survival Prediction with Missing Modality

📅 2026-03-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the performance degradation that missing modalities cause in survival prediction from multimodal medical data, a problem made worse because existing methods cannot distinguish modality-specific information from content that can be inferred across modalities. To this end, the authors propose MUST, a framework that explicitly decomposes each modality’s representation, within a learned low-rank shared subspace, into modality-specific and cross-modal contextual components. MUST further introduces a conditional latent diffusion model, guided by algebraic constraints and structural priors, to generate the missing modality-specific information that cannot be inferred from the available modalities. Integrated with a Transformer architecture, MUST achieves state-of-the-art performance across five TCGA cancer datasets, maintaining robust predictions and clinically acceptable inference latency even when a key modality such as histopathology or genomics is missing.

📝 Abstract

Accurate survival prediction from multimodal medical data is essential for precision oncology, yet clinical deployment faces a persistent challenge: modalities are frequently incomplete due to cost constraints, technical limitations, or retrospective data availability. While recent methods attempt to address missing modalities through feature alignment or joint distribution learning, they fundamentally lack explicit modeling of the unique contributions of each modality as opposed to the information derivable from other modalities. We propose MUST (Modality-Specific representation-aware Transformer), a novel framework that explicitly decomposes each modality's representation into modality-specific and cross-modal contextualized components through algebraic constraints in a learned low-rank shared subspace. This decomposition enables precise identification of what information is lost when a modality is absent. For the truly modality-specific information that cannot be inferred from available modalities, we employ conditional latent diffusion models to generate high-quality representations conditioned on recovered shared information and learned structural priors. Extensive experiments on five TCGA cancer datasets demonstrate that MUST achieves state-of-the-art performance with complete data while maintaining robust predictions in both missing pathology and missing genomics conditions, with clinically acceptable inference latency.
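
To make the idea of splitting a representation into a shared part and a modality-specific part more concrete, the sketch below uses a plain orthogonal projection. This is only an illustration under stated assumptions, not the authors' formulation: the abstract does not spell out the algebraic constraints, and MUST learns its subspace, whereas here the subspace is fixed by hand. The names `orthonormalize`, `decompose`, `subspace`, and `z_pathology`, as well as all numeric values, are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(vectors, eps=1e-12):
    """Gram-Schmidt: turn spanning vectors into an orthonormal basis."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip vectors that are linearly dependent
            basis.append([wi / norm for wi in w])
    return basis


def decompose(z, basis):
    """Split z into its projection onto span(basis) and the orthogonal residual.

    Assumption for this sketch: the projection stands in for the
    cross-modal (shared) component, the residual for the
    modality-specific component.
    """
    shared = [0.0] * len(z)
    for b in basis:
        c = dot(z, b)
        shared = [s + c * bi for s, bi in zip(shared, b)]
    specific = [zi - si for zi, si in zip(z, shared)]
    return shared, specific


# Hypothetical rank-2 shared subspace inside a 4-dimensional feature space.
subspace = orthonormalize([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]])

# Made-up embedding of one modality (for example, pathology features).
z_pathology = [2.0, 0.0, 3.0, 5.0]

shared, specific = decompose(z_pathology, subspace)
print("shared:  ", shared)
print("specific:", specific)
print("inner product:", dot(shared, specific))
```

In this toy setting the two components are orthogonal and sum back to the original vector, so the residual isolates exactly what the shared subspace cannot express. In the framing of the abstract, that residual is the kind of information that would have to be generated, rather than inferred, when the modality is absent.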

Problem

Research questions and friction points this paper is trying to address.

survival prediction

missing modality

multimodal medical data

precision oncology

incomplete data

Innovation

Methods, ideas, or system contributions that make the work stand out.

modality-specific representation

conditional latent diffusion

low-rank shared subspace