A Flow Model with Low-Rank Transformers for Incomplete Multimodal Survival Analysis

📅 2025-10-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Real-world multimodal survival analysis frequently encounters missing modality issues; existing approaches neglect inter-modal distributional discrepancies, leading to inconsistent reconstructions and poor generalization. To address this, we propose a low-rank Transformer–normalizing flow fusion framework. First, we design a class-conditional normalizing flow module to align cross-modal distributions and construct a distributionally consistent latent space. Second, we incorporate a low-rank Transformer to model intra-modal long-range dependencies, enhancing stability in high-dimensional multimodal fusion. This is the first work to integrate normalizing flows with low-rank Transformers for incomplete multimodal survival analysis, effectively mitigating distribution shift and overfitting. Extensive experiments demonstrate that our method achieves state-of-the-art performance under both complete and incomplete modality settings, significantly improving the robustness and accuracy of survival prediction.

📝 Abstract
Survival analysis based on multimodal medical data has attracted considerable attention in recent years. However, real-world datasets are often modality-incomplete: some patients lack one or more modalities because of acquisition limitations or system failures. Existing methods typically infer missing modalities directly from the observed ones using deep neural networks, but they often ignore the distributional discrepancy across modalities, which leads to inconsistent and unreliable reconstructions. To address these challenges, we propose a novel framework that combines a low-rank Transformer with a flow-based generative model for robust and flexible multimodal survival prediction. Specifically, we first formulate the problem as incomplete multimodal survival analysis over multi-instance representations of whole slide images (WSIs) and genomic profiles. We then propose a class-specific flow for cross-modal distribution alignment, which models and transforms the cross-modal distribution conditioned on class labels. Because a normalizing flow is invertible and supports accurate density modeling, the model can construct a distribution-consistent latent space for the missing modality, improving the consistency between the reconstructed data and the true distribution. Finally, we design a lightweight low-rank Transformer that models intra-modal dependencies while alleviating overfitting in high-dimensional modality fusion. Extensive experiments demonstrate that our method achieves state-of-the-art performance in the complete-modality setting and maintains robust, superior accuracy in the incomplete-modality setting.
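The density-modeling property the abstract relies on is the standard change-of-variables identity for normalizing flows. The form below is a generic sketch, not the paper's own notation: the symbols f_y (an invertible map selected by class label y), p_Z (a simple base density such as a standard Gaussian), and x (a modality feature vector) are assumptions introduced here for illustration.

```latex
z = f_y(x), \qquad x = f_y^{-1}(z), \qquad
\log p_X(x \mid y) = \log p_Z\big(f_y(x)\big)
  + \log \left| \det \frac{\partial f_y(x)}{\partial x} \right|
```

Because f_y is invertible, a latent sample z can be mapped back to feature space with f_y^{-1}, which is what makes reconstruction of a missing modality from the latent space possible in principle.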
Problem

Research questions and friction points this paper is trying to address.

Handles incomplete multimodal medical survival data
Aligns cross-modal distributions using flow models
Uses low-rank Transformers for robust modality fusion
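The summary does not say how the Transformer is made low-rank. One common approach, shown here purely as an assumption, is to factor each d x d projection matrix into a d x r and an r x d matrix with r much smaller than d. The minimal NumPy sketch below applies this to single-head self-attention; all names (`init_low_rank`, `low_rank_self_attention`, `rank`) are hypothetical and not taken from the paper.

```python
import numpy as np


def init_low_rank(d_model, rank, rng):
    """Create factor pairs (d_model x rank, rank x d_model) for Q, K and V."""
    scale = 1.0 / np.sqrt(d_model)
    return {
        name: (rng.normal(0.0, scale, (d_model, rank)),
               rng.normal(0.0, scale, (rank, d_model)))
        for name in ("q", "k", "v")
    }


def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)


def low_rank_self_attention(x, params):
    """Single-head self-attention over x of shape (n_tokens, d_model)."""
    # Each projection is applied as (x @ A) @ B, so the full
    # d_model x d_model matrix is never materialised.
    q, k, v = ((x @ a) @ b for a, b in (params["q"], params["k"], params["v"]))
    weights = softmax(q @ k.T / np.sqrt(x.shape[-1]))
    return weights @ v


def count_params(params):
    """Total number of scalars stored in the factor pairs."""
    return sum(a.size + b.size for a, b in params.values())


# Illustrative sizes only (assumptions, not values from the paper).
rng = np.random.default_rng(0)
d_model, rank, n_tokens = 256, 8, 32
params = init_low_rank(d_model, rank, rng)
tokens = rng.normal(size=(n_tokens, d_model))

out = low_rank_self_attention(tokens, params)
full_params = 3 * d_model * d_model      # dense Q, K, V projections
low_rank_params = count_params(params)   # factorised Q, K, V projections
```

With these illustrative sizes the factorised projections store 6 * d_model * rank weights instead of 3 * d_model^2, which is the kind of parameter reduction that can help against overfitting when fusing high-dimensional modalities.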
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flow model aligns cross-modal distributions for consistency
Low-rank Transformer reduces overfitting in modality fusion
Reversible flow structure reconstructs missing modality data
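To make the "reversible structure" concrete, here is a minimal sketch of a class-conditional affine coupling layer, a standard invertible building block for normalizing flows. The summary does not describe the paper's actual flow architecture, so the coupling design, the toy linear conditioner, and every name below (`ConditionalAffineCoupling`, `gaussian_log_prob`) are assumptions for illustration. The log-density is computed as the base Gaussian log-density of z plus the log-determinant of the Jacobian.

```python
import numpy as np


class ConditionalAffineCoupling:
    """Affine coupling layer whose scale and shift depend on a class label."""

    def __init__(self, dim, n_classes, rng):
        self.half = dim // 2
        rest = dim - self.half
        # Toy conditioner: one linear map per class (hypothetical; a real
        # model would normally use a small neural network here).
        self.w_scale = rng.normal(0.0, 0.1, (n_classes, self.half, rest))
        self.w_shift = rng.normal(0.0, 0.1, (n_classes, self.half, rest))

    def _scale_shift(self, x_a, label):
        log_scale = np.tanh(x_a @ self.w_scale[label])  # bounded for stability
        shift = x_a @ self.w_shift[label]
        return log_scale, shift

    def forward(self, x, label):
        """Map x to z; return z and log|det J| for each sample."""
        x_a, x_b = x[:, :self.half], x[:, self.half:]
        log_scale, shift = self._scale_shift(x_a, label)
        z_b = x_b * np.exp(log_scale) + shift
        return np.concatenate([x_a, z_b], axis=1), log_scale.sum(axis=1)

    def inverse(self, z, label):
        """Map z back to x using the same class label."""
        z_a, z_b = z[:, :self.half], z[:, self.half:]
        log_scale, shift = self._scale_shift(z_a, label)
        x_b = (z_b - shift) * np.exp(-log_scale)
        return np.concatenate([z_a, x_b], axis=1)


def gaussian_log_prob(z):
    """Log-density of a standard Gaussian, summed over feature dimensions."""
    return -0.5 * (z ** 2).sum(axis=1) - 0.5 * z.shape[1] * np.log(2 * np.pi)


# Illustrative data: 4 feature vectors of dimension 6, all with class label 1.
rng = np.random.default_rng(0)
layer = ConditionalAffineCoupling(dim=6, n_classes=2, rng=rng)
x = rng.normal(size=(4, 6))

z, log_det = layer.forward(x, label=1)
x_rec = layer.inverse(z, label=1)            # recovers x up to rounding error
log_px = gaussian_log_prob(z) + log_det      # change-of-variables log-density
```

The first half of each vector passes through unchanged and parameterises the transform of the second half, which is why the inverse can be computed in closed form. Stacking several such layers with alternating halves is the usual way to build a more expressive flow.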
Yi Yin
College of Computer and Mathematics, Central South University of Forestry and Technology, Changsha, Hunan, 410004, China
Yuntao Shou
College of Computer and Mathematics, Central South University of Forestry and Technology, Changsha, Hunan, 410004, China
Zao Dai
School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, 710049, China
Yun Peng
Tao Meng
Central South University of Forestry and Technology
Graph Neural Network, Multimodal Emotion Recognition, Text Classification, Entity Alignment
Wei Ai
College of Computer and Mathematics, Central South University of Forestry and Technology, Changsha, Hunan, 410004, China
Keqin Li
AMA University
Robotic, Machine learning, Artificial intelligence, Computer vision