Attention-Based Multimodal Survival Prediction with Cross-Modal Bilinear Fusion

📅 2026-05-12

🤖 AI Summary
This study addresses patient-level survival prediction in high-risk non-muscle-invasive bladder cancer (HR-NMIBC) with a multimodal deep learning framework that integrates whole-slide histopathology images, RNA-seq expression profiles, and clinical variables. An attention-based multiple instance learning (ABMIL) module produces the slide-level representation, feedforward encoders embed the RNA and clinical data, and a low-rank bilinear fusion step models pairwise conditional interactions between modalities while keeping parameter growth under control and the fusion structure interpretable. The model outputs a continuous risk score, which is then mapped to a survival time by a nonparametric calibration procedure based on the Kaplan–Meier estimator. On the CHIMERA challenge dataset, the approach improves predictive performance over concatenation-based baselines and generalizes competitively to the hidden evaluation cohorts.
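The ABMIL pooling step mentioned in the summary can be illustrated with a minimal sketch. It follows the attention pooling of Ilse et al. (2018): each patch embedding h_k receives a weight a_k = softmax_k(w^T tanh(V h_k)), and the slide embedding is the attention-weighted sum of the patch embeddings. The function name `abmil_pool`, the toy dimensions, and the hand-set parameters `V` and `w` are illustrative assumptions, not taken from the authors' code, which may differ (for example by using a gated attention variant).

```python
import math


def abmil_pool(instances, V, w):
    """Attention pooling over a bag of instance embeddings.

    instances: list of K embeddings, each a list of D floats (hypothetical patch features)
    V:         attention projection, shape (hidden, D), as nested lists
    w:         attention vector of length `hidden`
    Returns (bag_embedding, attention_weights).
    """
    # Unnormalized attention score per instance: w^T tanh(V h_k)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))

    # Numerically stable softmax over the instances of the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Bag embedding: attention-weighted sum of the instance embeddings
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(attn, instances)) for d in range(dim)]
    return bag, attn


# Toy bag with three 2-dimensional "patches" (made-up numbers)
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
slide_embedding, weights = abmil_pool(patches, V=[[0.5, -0.5]], w=[1.0])
print(slide_embedding, weights)
```

The attention weights sum to one and can be inspected per patch, which is what makes this kind of pooling useful for locating the regions that drive a slide-level prediction.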
📝 Abstract
We propose a novel multimodal deep learning framework for patient-level survival prediction, which integrates whole-slide histology features, RNA-seq expression profiles, and clinical variables. Our architecture combines an ABMIL module (Ilse et al., 2018) for slide-level representation with feedforward encoders for RNA and clinical data. These embeddings are then integrated through low-rank bilinear cross-modal fusion (Liu et al., 2018) to model conditional interactions across modalities while controlling parameter growth. The model outputs continuous risk scores that are subsequently mapped to survival times using a nonparametric calibration procedure based on the Kaplan–Meier estimator (Kaplan and Meier, 1958). By decomposing multimodal reasoning into independent pairwise interactions, the proposed fusion design promotes structural interpretability and parameter efficiency compared with full tensor and hierarchical fusion strategies. Experiments on the CHIMERA challenge dataset demonstrate improved predictive performance over concatenation-based baselines and competitive generalization on hidden evaluation cohorts. These results indicate that the proposed framework is a promising approach for multimodal survival prediction in HR-NMIBC. The implementation is publicly available at https://github.com/hassancpu/ChimeraChallenge2025_Task_3.
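To make the pairwise low-rank bilinear fusion described in the abstract more concrete, here is a minimal sketch under stated assumptions. A full bilinear map from embeddings x and y to an output z needs dim_x * dim_y * dim_out weights; the low-rank form z = P((Ux) * (Vy)), with an elementwise product over `rank` factors, needs only rank * (dim_x + dim_y + dim_out). The class `LowRankBilinear`, the helper `fuse_pairwise`, the modality names, the dimensions, and the choice to concatenate the pairwise outputs are all hypothetical; the exact factorization and aggregation used by the authors are not specified here.

```python
import random
from itertools import combinations


def matvec(matrix, vector):
    """Matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for learned parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class LowRankBilinear:
    """Rank-constrained bilinear interaction: z = P((U x) * (V y))."""

    def __init__(self, dim_x, dim_y, dim_out, rank, rng):
        self.U = random_matrix(rank, dim_x, rng)    # projects x onto `rank` factors
        self.V = random_matrix(rank, dim_y, rng)    # projects y onto `rank` factors
        self.P = random_matrix(dim_out, rank, rng)  # mixes the factors into the output

    def __call__(self, x, y):
        # Elementwise product of the two projections, then the output mixing
        joint = [a * b for a, b in zip(matvec(self.U, x), matvec(self.V, y))]
        return matvec(self.P, joint)

    def num_params(self):
        return sum(len(m) * len(m[0]) for m in (self.U, self.V, self.P))


def fuse_pairwise(embeddings, fusers):
    """Fuse every modality pair independently and concatenate the results (an assumption)."""
    fused = []
    for a, b in combinations(sorted(embeddings), 2):
        fused.extend(fusers[(a, b)](embeddings[a], embeddings[b]))
    return fused


rng = random.Random(0)
dims = {"clinical": 4, "rna": 6, "wsi": 8}  # hypothetical embedding sizes
embeddings = {name: [rng.gauss(0.0, 1.0) for _ in range(d)] for name, d in dims.items()}
fusers = {
    (a, b): LowRankBilinear(dims[a], dims[b], dim_out=5, rank=3, rng=rng)
    for a, b in combinations(sorted(dims), 2)
}
fused = fuse_pairwise(embeddings, fusers)
print(len(fused), fusers[("clinical", "rna")].num_params())
```

Because each pair has its own small fuser, the contribution of, say, the histology-RNA interaction can be examined separately from the others, which is one reading of the structural interpretability the abstract refers to.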
Problem

Research questions and friction points this paper is trying to address.

multimodal fusion
survival prediction
cross-modal interaction
HR-NMIBC
patient-level risk
Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-modal bilinear fusion
attention-based MIL
multimodal survival prediction
low-rank fusion
nonparametric calibration
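The nonparametric calibration listed above maps a continuous risk score to a survival time using the Kaplan–Meier estimator. The exact procedure is not described on this page, so the sketch below is only one plausible reading: estimate the Kaplan–Meier curve on a reference cohort, then assign each patient the earliest time at which the cumulative event fraction reaches that patient's risk rank (quantile matching). The functions `kaplan_meier` and `risk_to_time` and all the numbers are hypothetical, and the sketch assumes the reference cohort contains at least one observed event.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up times
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) at each distinct time with at least one event.
    """
    order = sorted(zip(times, events))
    n_at_risk = len(order)
    surv = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects sharing the same follow-up time
        while i < len(order) and order[i][0] == t:
            n_events += order[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed  # events and censored subjects both leave the risk set
    return curve


def risk_to_time(risk, reference_risks, curve, tol=1e-12):
    """Map a risk score to a time by quantile matching (an assumed scheme).

    q is the fraction of reference patients at least as risky as this one;
    the patient gets the earliest time at which S(t) has dropped to 1 - q.
    """
    q = sum(r >= risk for r in reference_risks) / len(reference_risks)
    target = 1.0 - q
    for t, s in curve:
        if s <= target + tol:
            return t
    return curve[-1][0]  # survival never drops that far: fall back to the last event time


# Toy reference cohort (made-up numbers)
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 1, 0, 1])
reference_risks = [0.9, 0.7, 0.3, 0.1]
predicted = [risk_to_time(r, reference_risks, curve) for r in reference_risks]
print(curve, predicted)
```

Under this scheme the mapping is monotone: a higher risk score never yields a later predicted time, so ranking metrics computed on the risk scores are preserved.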
Authors
Hassan Keshvarikhojasteh
Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
Josien P. W. Pluim
Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
Mitko Veta
Associate Professor, Eindhoven University of Technology
Medical Image Analysis, Digital Pathology, Machine Learning