Learning from Limited and Incomplete Data: A Multimodal Framework for Predicting Pathological Response in NSCLC

📅 2026-03-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses preoperative prediction of major pathological response (pR) to neoadjuvant therapy in non-small cell lung cancer, a task hindered by small cohorts and frequently missing clinical data. The authors propose a multimodal deep learning framework that uses a foundation model to extract features from CT imaging and a missing-aware neural network that models incomplete clinical variables directly, without conventional imputation. A learnable weighted fusion mechanism combines the imaging and clinical modalities. Evaluated in real-world clinical settings with limited sample sizes, the multimodal model consistently outperforms both the imaging-only and clinical-only baselines, supporting the value of the missing-aware architecture and the fusion strategy.
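
To make the idea of modeling missing clinical variables without imputation concrete, here is a minimal sketch of one common approach: pair each clinical value with a binary "observed" mask, so a downstream network can distinguish a true zero from a placeholder. This is an illustrative assumption, not the authors' architecture, which the summary does not specify. The function names and the feature names (`age`, `smoker`, `pdl1_percent`) are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def encode_with_mask(
    record: Dict[str, Optional[float]],
    feature_names: Sequence[str],
) -> Tuple[List[float], List[float]]:
    """Return (values, mask) for one patient record.

    Missing entries (absent key, None, or NaN) get value 0.0 and mask 0.0.
    Observed entries keep their value and get mask 1.0.
    """
    values: List[float] = []
    mask: List[float] = []
    for name in feature_names:
        raw = record.get(name)
        missing = raw is None or (isinstance(raw, float) and math.isnan(raw))
        values.append(0.0 if missing else float(raw))
        mask.append(0.0 if missing else 1.0)
    return values, mask


def to_model_input(values: List[float], mask: List[float]) -> List[float]:
    # Concatenate values and mask: the mask tells the network which zeros
    # are placeholders for missing measurements rather than real values.
    return values + mask


# Hypothetical feature names and patient record, for illustration only.
FEATURES = ["age", "smoker", "pdl1_percent"]
patient = {"age": 64.0, "smoker": 0.0, "pdl1_percent": None}

values, mask = encode_with_mask(patient, FEATURES)
model_input = to_model_input(values, mask)
print(model_input)  # [64.0, 0.0, 0.0, 1.0, 1.0, 0.0]
```

In this sketch the non-smoker flag and the missing PD-L1 measurement both have value 0.0, but their mask entries differ (1.0 versus 0.0), which is the information an imputation-based pipeline would discard.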

📝 Abstract

Major pathological response (pR) following neoadjuvant therapy is a clinically meaningful endpoint in non-small cell lung cancer, strongly associated with improved survival. However, accurate preoperative prediction of pR remains challenging, particularly in real-world clinical settings characterized by limited data availability and incomplete clinical profiles. In this study, we propose a multimodal deep learning framework designed to address these constraints by integrating foundation model-based CT feature extraction with a missing-aware architecture for clinical variables. This approach enables robust learning from small cohorts while explicitly modeling missing clinical information, without relying on conventional imputation strategies. A weighted fusion mechanism is employed to leverage the complementary contributions of imaging and clinical modalities, yielding a multimodal model that consistently outperforms both unimodal imaging and clinical baselines. These findings underscore the added value of integrating heterogeneous data sources and highlight the potential of multimodal, missing-aware systems to support pR prediction under realistic clinical conditions.
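
The weighted fusion described above can be illustrated with a small sketch. It assumes decision-level fusion, where each modality produces its own pR probability and the fused prediction is a convex combination of the two; the abstract does not state whether fusion happens at the feature or decision level, so this is one plausible reading only. The names `softmax` and `fuse` and the example probabilities are hypothetical. In a trained model the logits would be learned parameters; here they are fixed by hand.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Map unconstrained logits to positive weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(probs: Sequence[float], logits: Sequence[float]) -> float:
    """Convex combination of per-modality probabilities.

    probs  : one predicted probability per modality
    logits : one fusion logit per modality (trainable in a real model)
    """
    if len(probs) != len(logits):
        raise ValueError("probs and logits must have the same length")
    weights = softmax(logits)
    return sum(w * p for w, p in zip(weights, probs))


# Hypothetical per-modality outputs, for illustration only.
p_imaging, p_clinical = 0.70, 0.40

# Equal logits give equal weights, i.e. a plain average of the two.
fused_equal = fuse([p_imaging, p_clinical], [0.0, 0.0])

# A larger imaging logit shifts the fused prediction toward imaging.
fused_imaging_heavy = fuse([p_imaging, p_clinical], [2.0, 0.0])
```

Parameterizing the weights through a softmax keeps them positive and normalized, so the fused output always lies between the two unimodal predictions.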

Problem

Research questions and friction points this paper is trying to address.

pathological response

non-small cell lung cancer

limited data

incomplete clinical data

preoperative prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal learning

missing-aware architecture

foundation model