🤖 AI Summary
Existing pan-cancer prognostic models struggle to integrate histopathological images, clinical reports, and genomic data effectively, which leads to poorly generalizable representations and inefficient use of the available data. To address this, we propose MICE (Multimodal data Integration via Collaborative Experts), a framework that employs functionally diverse expert modules to jointly model cross-cancer commonalities and cancer-type-specific features. MICE couples contrastive learning with supervised learning to align multimodal representations and optimize them for discrimination. Evaluated on 30 cancer types from TCGA and other cohorts comprising 11,799 patients, MICE achieves C-index improvements of 3.8–11.2% on internal validation cohorts and 5.8–8.8% on independent external cohorts over unimodal and multi-expert-based multimodal baselines. It also improves cross-institutional generalizability and remains robust in limited-sample settings. This work establishes a scalable multimodal paradigm for precision oncology prognosis.
📝 Abstract
Multimodal data provides heterogeneous information for a holistic understanding of the tumor microenvironment. However, existing AI models often struggle to harness the rich information within multimodal data, and the representations they extract generalize poorly. Here we present MICE (Multimodal data Integration via Collaborative Experts), a multimodal foundation model that effectively integrates pathology images, clinical reports, and genomics data for precise pan-cancer prognosis prediction. Instead of conventional multi-expert modules, MICE employs multiple functionally diverse experts to comprehensively capture both cross-cancer and cancer-specific insights. Leveraging data from 11,799 patients across 30 cancer types, we enhanced MICE's generalizability by coupling contrastive and supervised learning. MICE outperformed both unimodal and state-of-the-art multi-expert-based multimodal models, demonstrating substantial improvements in C-index ranging from 3.8% to 11.2% on internal cohorts and from 5.8% to 8.8% on independent cohorts. Moreover, it exhibited remarkable data efficiency across diverse clinical scenarios. With its enhanced generalizability and data efficiency, MICE establishes an effective and scalable foundation for pan-cancer prognosis prediction, holding strong potential to support personalized therapies and improve treatment outcomes.
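The results above are reported as the C-index (concordance index), which measures how often a model ranks a pair of patients in the correct order of survival. The following is a minimal sketch of Harrell's C-index, not the evaluation code used for MICE: the function name `concordance_index` and the toy inputs are hypothetical, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    risks  : predicted risk score per patient (higher = worse prognosis)

    Hypothetical helper for illustration only; tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk for the earlier event: correct
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
mixed = concordance_index(times, events, [0.9, 0.1, 0.4, 0.7])
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 to a fully inverted ranking.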