🤖 AI Summary
This study addresses multimodal survival prediction in non-small cell lung cancer, where missing data across CT imaging, whole-slide pathology images (WSI), and clinical records often hinder the clinical deployment of deep learning models. To overcome this limitation, the authors propose a missingness-aware multimodal survival prediction framework that uses foundation models to extract modality-specific features and introduces a missingness-aware encoding mechanism. This design lets the model adaptively use whatever information is available, without discarding incomplete samples or resorting to imputation, while dynamically adjusting each modality’s contribution during intermediate fusion. Experiments show that the proposed approach significantly outperforms both unimodal baselines and early/late fusion strategies under naturally occurring missing modalities, achieving a C-index of 73.30 when fusing WSI and clinical data, confirming its effectiveness and robustness.
📝 Abstract
Accurate survival prediction in Non-Small Cell Lung Cancer (NSCLC) requires the integration of heterogeneous clinical, radiological, and histopathological information. While Multimodal Deep Learning (MDL) holds promise for precision prognosis and survival prediction, its clinical applicability is severely limited by small cohort sizes and missing modalities, which often force complete-case filtering or aggressive imputation. In this work, we present a missingness-aware multimodal survival framework that integrates Computed Tomography (CT), Whole-Slide histopathology Images (WSI), and structured clinical variables for overall survival modeling in unresectable stage II-III NSCLC. By leveraging Foundation Models (FMs) for modality-specific feature extraction and a missingness-aware encoding strategy, the proposed approach enables intermediate multimodal fusion under naturally incomplete modality profiles. The architecture is resilient to missing modalities by design, allowing the model to use all available data without dropping patients during training or inference. Experimental results demonstrate that intermediate fusion consistently outperforms unimodal baselines as well as early and late fusion strategies, with the strongest performance achieved by fusing the WSI and clinical modalities (C-index of 73.30). Further analyses of modality importance reveal an adaptive behavior in which less informative modalities, i.e., CT, are automatically down-weighted and contribute less to the final survival prediction.
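The abstract describes an intermediate-fusion mechanism that both tolerates missing modalities and adaptively down-weights less informative ones. A minimal sketch of one way such missingness-aware weighting could work, renormalizing per-modality importance scores over only the modalities that are present; the function name, feature shapes, and scalar scores are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def missing_aware_fusion(features, mask, scores):
    """Fuse modality embeddings under missingness.

    features : (n_modalities, d) array of per-modality embeddings
               (e.g. rows for CT, WSI, clinical), zeros if absent.
    mask     : (n_modalities,) 0/1 availability indicator per patient.
    scores   : (n_modalities,) learned importance scores (here fixed).

    Returns the fused (d,) embedding and the fusion weights, which
    are a softmax over *present* modalities only, so absent ones
    contribute exactly zero and the rest are renormalized.
    """
    present = mask.astype(bool)
    masked = np.where(present, scores, -np.inf)      # exclude missing modalities
    exp = np.exp(masked - masked[present].max())     # numerically stable softmax
    alpha = exp / exp.sum()
    return alpha @ features, alpha

# Patient with CT missing but WSI and clinical available:
feats = np.array([[1.0, 0.0],    # CT (absent, placeholder row)
                  [0.0, 2.0],    # WSI
                  [2.0, 2.0]])   # clinical
fused, alpha = missing_aware_fusion(feats, np.array([0, 1, 1]),
                                    np.array([5.0, 1.0, 1.0]))
```

Here the CT row receives weight exactly 0 regardless of its score, while the remaining weights still sum to 1, mirroring the adaptive down-weighting behavior the abstract reports.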