From Classical Machine Learning to Emerging Foundation Models: Review on Multimodal Data Integration for Cancer Research

2025-07-11
Citations: 0
Influential: 0
AI Summary
This study addresses key challenges in cancer precision medicine, namely the difficulty of integrating heterogeneous multimodal data (genomic, proteomic, imaging, and clinical), limited model interpretability, and poor generalizability. We systematically formulate an evolutionary framework spanning traditional machine learning to large language models (LLMs) and multimodal foundation models (MFMs). A unified multimodal fusion paradigm is proposed, supporting core oncology tasks including tumor subtype classification, biomarker discovery, treatment response prediction, and prognosis assessment. Leveraging publicly available multi-omics and imaging datasets, we empirically validate critical techniques: cross-modal alignment, joint representation learning, and interpretability enhancement. Notably, we present the first comprehensive migration landscape of foundation models in oncology, identifying current bottlenecks and actionable development pathways. This work establishes a systematic methodology for building generalizable, clinically trustworthy, AI-driven precision oncology solutions.

Abstract
Cancer research is increasingly driven by the integration of diverse data modalities, spanning from genomics and proteomics to imaging and clinical factors. However, extracting actionable insights from these vast and heterogeneous datasets remains a key challenge. The rise of foundation models (FMs), large deep-learning models pretrained on extensive amounts of data that serve as a backbone for a wide range of downstream tasks, offers new avenues for discovering biomarkers, improving diagnosis, and personalizing treatment. This paper presents a comprehensive review of widely adopted strategies for integrating multimodal data, to help advance computational approaches for data-driven discoveries in oncology. We examine emerging trends in machine learning (ML) and deep learning (DL), including methodological frameworks, validation protocols, and open-source resources targeting cancer subtype classification, biomarker discovery, treatment guidance, and outcome prediction. This study also comprehensively covers the shift from traditional ML to FMs for multimodal integration. We present a holistic view of recent FM advancements and the challenges faced when integrating multi-omics with advanced imaging data. We identify the state-of-the-art FMs, publicly available multimodal repositories, and advanced tools and methods for data integration. We argue that current state-of-the-art integrative methods provide the essential groundwork for developing the next generation of large-scale, pretrained models poised to further revolutionize oncology. To the best of our knowledge, this is the first review to systematically map the transition from conventional ML to advanced FMs for multimodal data integration in oncology, while also framing these developments as foundational for the forthcoming era of large-scale AI models in cancer research.
Problem

Research questions and friction points this paper is trying to address.

Integrating diverse cancer data modalities for actionable insights
Transitioning from classical ML to foundation models in oncology
Addressing challenges in multi-omics and imaging data integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Foundation models integrate multi-omics and imaging data
Transition from classical ML to advanced deep learning
Comprehensive review of multimodal data integration strategies
Amgad Muneer
Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Muhammad Waqas
Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Maliazurina B Saad
Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Eman Showkatian
Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Rukhmini Bandyopadhyay
Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Hui Xu
Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Wentao Li
Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Joe Y Chang
Department of Thoracic Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Zhongxing Liao
Department of Thoracic Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Cara Haymaker
Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Luisa Solis Soto
Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Carol C Wu
Department of Thoracic Imaging, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Natalie I Vokes
Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Xiuning Le
Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Lauren A Byers
Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Don L Gibbons
Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
John V Heymach
Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Jianjun Zhang
South China University of Technology
Jia Wu
Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.