Is an Ultra Large Natural Image-Based Foundation Model Superior to a Retina-Specific Model for Detecting Ocular and Systemic Diseases?

📅 2025-02-10
🤖 AI Summary
Problem: It remains unclear whether general-purpose or domain-specific foundation models are better suited to ophthalmic disease detection and systemic disease prediction. Method: The study fine-tunes DINOv2 (a general-purpose vision foundation model, in large, base, and small variants) and RETFound (a retina-specific model) with supervised fine-tuning on eight open-source fundus image datasets plus AlzEye and UK Biobank, compares them by AUROC with statistical significance testing, and repeats the comparison using only 10% of the fine-tuning data. Contribution/Results: DINOv2-large outperforms RETFound in diabetic retinopathy detection (AUROC 0.850–0.952 vs. 0.823–0.944), whereas RETFound outperforms all DINOv2 models in predicting heart failure, myocardial infarction, and ischaemic stroke (AUROC 0.732–0.796 vs. 0.663–0.771). These findings support selecting a vision foundation model according to the task rather than assuming that either model type is uniformly better.

📝 Abstract
The advent of foundation models (FMs) is transforming the medical domain. In ophthalmology, RETFound, a retina-specific FM pre-trained sequentially on 1.4 million natural images and 1.6 million retinal images, has demonstrated high adaptability across clinical applications. Conversely, DINOv2, a general-purpose vision FM pre-trained on 142 million natural images, has shown promise in non-medical domains, but its applicability to clinical tasks remains underexplored. To address this, we conducted head-to-head evaluations by fine-tuning RETFound and three DINOv2 models (large, base, small) for ocular disease detection and systemic disease prediction tasks, across eight standardized open-source ocular datasets as well as the Moorfields AlzEye and UK Biobank datasets. The DINOv2-large model outperformed RETFound in detecting diabetic retinopathy (AUROC=0.850-0.952 vs. 0.823-0.944 across three datasets, all P≤0.007) and multi-class eye diseases (AUROC=0.892 vs. 0.846, P<0.001). For glaucoma detection, the DINOv2-base model outperformed RETFound (AUROC=0.958 vs. 0.940, P<0.001). Conversely, RETFound achieved superior performance over all DINOv2 models in predicting heart failure, myocardial infarction, and ischaemic stroke (AUROC=0.732-0.796 vs. 0.663-0.771, all P<0.001). These trends persisted even with 10% of the fine-tuning data. These findings showcase the distinct scenarios where general-purpose and domain-specific FMs excel, highlighting the importance of aligning FM selection with task-specific requirements to optimise clinical performance.
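The comparisons in the abstract rest on AUROC differences between two models scored on the same test cases. The abstract does not say which statistical test produced the reported P values, so the following is only a minimal sketch of one common option, a paired bootstrap over test cases. The function names `auroc` and `paired_bootstrap_auroc_diff` are hypothetical and are not taken from the paper.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def paired_bootstrap_auroc_diff(labels, scores_a, scores_b, n_boot=1000, seed=0):
    """Return the observed AUROC difference (A - B) and a 95% percentile interval.

    Assumption: both models are scored on the same cases, so each resample
    draws case indices once and applies them to both score lists.
    """
    rng = random.Random(seed)
    n = len(labels)
    observed = auroc(labels, scores_a) - auroc(labels, scores_b)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # a resample with a single class has no defined AUROC
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auroc(y, a) - auroc(y, b))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return observed, (lower, upper)
```

An interval that excludes zero would suggest that one model's AUROC is reliably higher on that test set. The quadratic pairwise loop is kept for clarity and would be slow on datasets the size of UK Biobank, where a rank-based implementation is the usual choice.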
Problem

Research questions and friction points this paper is trying to address.

Compare general-purpose and retina-specific foundation models
Evaluate models for ocular and systemic disease detection
Determine optimal model selection for clinical tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuning general-purpose vision models
Comparing retina-specific and natural image models
Evaluating models on multiple disease datasets
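The abstract notes that the trends persisted with 10% of the fine-tuning data. It does not describe how that subset was drawn, so the sketch below assumes a class-stratified random sample, which keeps the label balance of the full set. The helper name `stratified_subsample` is hypothetical.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, fraction=0.10, seed=0):
    """Return sorted indices of a class-stratified subset of the given fraction."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    chosen = []
    for idxs in by_class.values():
        # Assumption: keep at least one case per class so rare labels survive.
        k = max(1, round(fraction * len(idxs)))
        chosen.extend(rng.sample(idxs, k))
    return sorted(chosen)
```

For example, a training set with 80 negative and 20 positive cases yields 8 negatives and 2 positives at the default fraction.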
Qingshan Hou
Northeastern University; National University of Singapore
medical image analysis, foundation model, deep learning
Yukun Zhou
Centre for Medical Image Computing, University College London, London, United Kingdom; NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK; Department of Medical Physics and Biomedical Engineering, University College London, London, United Kingdom
Jocelyn Hui Lin Goh
Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
Ke Zou
Apple, Inc
Power electronics, Switched-capacitor Converter, Power Semiconductor Devices
Samantha Min Er Yew
Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
Sahana Srinivasan
Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
Meng Wang
Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
Thaddaeus Lo
Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
Xiaofeng Lei
Institute of High-Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Siegfried K. Wagner
NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
Mark A. Chia
NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
Dawei Yang
Department of Ophthalmology and Visual Sciences, Chinese University of Hong Kong, Hong Kong, China
Hongyang Jiang
Department of Ophthalmology and Visual Sciences, Chinese University of Hong Kong, Hong Kong, China
AnRan Ran
Department of Ophthalmology and Visual Sciences, Chinese University of Hong Kong, Hong Kong, China
Rui Santos
Department of Ophthalmology, Stadtspital Zürich, Zurich, Switzerland
Gabor Mark Somfai
Department of Ophthalmology, Stadtspital Zürich, Zurich, Switzerland
Juan Helen Zhou
Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
Haoyu Chen
Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
Qingyu Chen
Biomedical Informatics & Data Science, Yale University; NCBI-NLM, National Institutes of Health
Text mining, Machine learning, Data curation, BioNLP, Medical Imaging Analysis
Carol Yim-Lui Cheung
Department of Ophthalmology and Visual Sciences, Chinese University of Hong Kong, Hong Kong, China
Pearse A. Keane
NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
Yih Chung Tham
Yong Loo Lin School of Medicine, National University of Singapore; Singapore Eye Research Institute
Ophthalmology, Epidemiology, Visual Impairment, Deep Learning