🤖 AI Summary
Accurately predicting pathological complete response (pCR) following neoadjuvant therapy for non-small-cell lung cancer (NSCLC) remains challenging due to difficulties in effectively fusing imaging and clinical data and the limited interpretability of existing deep learning models.
Method: We propose a physician-in-the-loop multimodal deep learning framework. First, we introduce a novel clinical-knowledge-guided “physician-in-the-loop” mechanism, embedding radiologist-annotated lesion priors into model training. Second, we design a mid-level dynamic fusion architecture for imaging and clinical features, integrated with an interpretable attention module to enable progressive lesion focusing.
Contribution/Results: Evaluated on a public cohort, our framework significantly improves pCR prediction accuracy (+5.2%). It generates clinically interpretable saliency maps and transparent decision rationales. To our knowledge, this is the first reproducible, verifiable, and clinically deployable multimodal explainable AI paradigm for NSCLC neoadjuvant response assessment.
📝 Abstract
This study proposes a novel approach combining Multimodal Deep Learning with intrinsic eXplainable Artificial Intelligence techniques to predict pathological response in non-small cell lung cancer patients undergoing neoadjuvant therapy. Due to the limitations of existing radiomics and unimodal deep learning approaches, we introduce an intermediate fusion strategy that integrates imaging and clinical data, enabling efficient interaction between data modalities. The proposed Multimodal Doctor-in-the-Loop method further enhances clinical relevance by embedding clinicians' domain knowledge directly into the training process, guiding the model's focus gradually from broader lung regions to specific lesions. Results demonstrate improved predictive accuracy and explainability, providing insights into optimal data integration strategies for clinical applications.