Benchmarking Foundation Models for Renal Lesion Stratification in CT

📅 2026-05-08

🤖 AI Summary
This study evaluates how well open-source medical foundation models transfer to a six-class CT renal lesion classification task under data-scarce conditions. To address the generalization challenges that arise from limited training data, the authors use a frozen feature-probing protocol: embeddings from three foundation models are kept fixed and compared against a handcrafted radiomics classifier and a 3D ResNet-50 trained from scratch. Models are trained on a composite dataset of 2,854 lesions and evaluated on an external test set of 234 lesions from The Cancer Imaging Archive. The foundation models achieve AUCs between 0.70 and 0.77, comparable to the from-scratch ResNet-50 (AUC 0.72) at much lower hardware cost, while radiomics significantly outperforms all deep learning approaches with an AUC of 0.88 (all p ≤ 0.002). This points to a current limitation of generalist foundation model embeddings in capturing the fine-grained texture and shape heterogeneity of renal lesions.
📝 Abstract
The rapid proliferation of open-source medical foundation models (FMs) raises a practical question: how well do their pre-trained representations transfer to clinically relevant but data-scarce classification tasks? Greater generalizability would be especially valuable in CT-based renal lesion classification, a field constrained by inherently limited training data. We addressed this question by benchmarking three medical FMs on this specific task. The six-class problem spans common entities such as cysts and clear cell renal cell carcinoma, alongside rare subtypes. Using a frozen feature-probing protocol, we compared FM embeddings against a handcrafted radiomics classifier and a 3D ResNet-50 trained from scratch. Models were trained on a composite dataset of 2,854 lesions and evaluated on an external test set of 234 lesions from The Cancer Imaging Archive. Our results reveal two key findings. First, FM performance (AUC 0.70-0.77) matched the from-scratch ResNet (AUC 0.72) while drastically reducing hardware demand, requiring only seconds on a CPU after feature extraction. Second, the conventional radiomics baseline significantly outperformed all deep learning approaches, achieving an AUC of 0.88 (all p ≤ 0.002). This suggests that current generalist FM embeddings do not yet capture the fine-grained texture and shape heterogeneity driving histological subtype discrimination. Despite their potential in data-scarce settings, medical FMs did not surpass established models for renal lesion stratification, leaving radiomics as the current state-of-the-art.
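The frozen feature-probing idea described in the abstract can be illustrated with a small, self-contained sketch. Everything below is an assumption made for illustration, not the authors' pipeline: the embeddings are synthetic stand-ins for features from a frozen FM, the probe is a plain multinomial logistic regression, and the metric is a macro-averaged one-vs-rest AUC (the abstract does not state which classifier or AUC averaging was used). All function names are hypothetical. In practice one would more likely use a library such as scikit-learn; the standard library is used here only to keep the example dependency-free.

```python
import math
import random

N_CLASSES, DIM = 6, 8  # six lesion classes; embedding size is an arbitrary assumption

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, b, x):
    return [sum(wj * xj for wj, xj in zip(W[c], x)) + b[c] for c in range(len(W))]

def train_linear_probe(X, y, n_classes, lr=0.5, epochs=100):
    """Fit multinomial logistic regression on fixed (frozen) embeddings
    with full-batch gradient descent. The feature extractor is never updated."""
    d, n = len(X[0]), len(X)
    W = [[0.0] * d for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * d for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for x, t in zip(X, y):
            p = softmax(logits(W, b, x))
            for c in range(n_classes):
                err = p[c] - (1.0 if c == t else 0.0)  # cross-entropy gradient
                gb[c] += err
                for j in range(d):
                    gW[c][j] += err * x[j]
        for c in range(n_classes):
            b[c] -= lr * gb[c] / n
            for j in range(d):
                W[c][j] -= lr * gW[c][j] / n
    return W, b

def binary_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(proba, y, n_classes):
    """Average the one-vs-rest AUC over all classes."""
    per_class = [binary_auc([p[c] for p in proba], [t == c for t in y])
                 for c in range(n_classes)]
    return sum(per_class) / n_classes

def synthetic_embeddings(rng, per_class):
    """Hypothetical stand-in for embeddings extracted once from a frozen FM:
    each class is a Gaussian cluster centred on its own axis."""
    X, y = [], []
    for c in range(N_CLASSES):
        for _ in range(per_class):
            X.append([(2.0 if j == c else 0.0) + rng.gauss(0.0, 0.5) for j in range(DIM)])
            y.append(c)
    return X, y

rng = random.Random(0)
X_train, y_train = synthetic_embeddings(rng, per_class=30)
X_test, y_test = synthetic_embeddings(rng, per_class=20)

W, b = train_linear_probe(X_train, y_train, N_CLASSES)
proba = [softmax(logits(W, b, x)) for x in X_test]
auc = macro_ovr_auc(proba, y_test, N_CLASSES)
print(f"macro one-vs-rest AUC on synthetic data: {auc:.3f}")
```

Because only the small probe is trained, the expensive step (running the FM over each CT volume) happens once, which is consistent with the abstract's note that training takes only seconds on a CPU after feature extraction. The AUC printed here reflects the synthetic clusters only and says nothing about the reported results.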
Problem

Research questions and friction points this paper is trying to address.

foundation models
renal lesion stratification
CT imaging
data-scarce classification
medical AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

foundation models
renal lesion stratification
radiomics
feature probing
CT imaging
👥 Authors
Hartmut Häntze
Charité - Universitätsmedizin Berlin, Department of Radiology, Berlin, Germany
Sarah de Boer
Radboudumc, Diagnostic Image Analysis Group, Nijmegen, The Netherlands
Myrthe Buser
Radboudumc, Diagnostic Image Analysis Group, Nijmegen, The Netherlands
Alessa Hering
Radboud University Medical Center
Deep Learning, Image Registration, Tumor Follow-Up, LLM
Bram van Ginneken
Professor of Medical Image Analysis, Radboud University
Medical Image Analysis, Medical Imaging, Deep Learning, Computer-Aided Diagnosis
Mathias Prokop
Professor of Radiology, Radboudumc
Computed tomography, computer aided diagnosis, lung cancer, stroke
Jawed Nawabi
Charité - Universitätsmedizin Berlin, Institute of Neuroradiology, Berlin, Germany
Sebastian Ziegelmayer
Klinikum rechts der Isar, TUM University Hospital, Technical University of Munich, Munich, Germany
Lisa Adams
Assistant Professor of Radiology | Technical University Munich
Radiology, AI, Molecular MRI
Keno Bressem
Technical University Munich
deep learning, radiomics, microwave ablation