A Vision-Language Foundation Model for Zero-shot Clinical Collaboration and Automated Concept Discovery in Dermatology

📅 2026-02-11
🤖 AI Summary
Medical foundation models often depend on task-specific fine-tuning and generalize poorly in zero-shot clinical settings. This work proposes DermFM-Zero, a dermatology vision-language foundation model that performs zero-shot diagnosis and multimodal retrieval without task-specific fine-tuning. The model is trained with masked latent modeling and contrastive learning on over 4 million multimodal dermatological data points. Sparse autoencoders applied to its latent representations uncover interpretable clinical concepts without supervision and allow artifact-related biases to be suppressed without retraining. DermFM-Zero achieves state-of-the-art performance across 20 benchmarks. In three multinational reader studies involving more than 1,100 clinicians, AI-assisted general practitioners nearly doubled their differential diagnostic accuracy, the model outperformed board-certified dermatologists in multimodal skin cancer assessment, and AI-assisted non-experts surpassed unassisted experts.

📝 Abstract
Medical foundation models have shown promise in controlled benchmarks, yet widespread deployment remains hindered by reliance on task-specific fine-tuning. Here, we introduce DermFM-Zero, a dermatology vision-language foundation model trained via masked latent modelling and contrastive learning on over 4 million multimodal data points. We evaluated DermFM-Zero across 20 benchmarks spanning zero-shot diagnosis and multimodal retrieval, achieving state-of-the-art performance without task-specific adaptation. We further evaluated its zero-shot capabilities in three multinational reader studies involving over 1,100 clinicians. In primary care settings, AI assistance enabled general practitioners to nearly double their differential diagnostic accuracy across 98 skin conditions. In specialist settings, the model significantly outperformed board-certified dermatologists in multimodal skin cancer assessment. In collaborative workflows, AI assistance enabled non-experts to surpass unassisted experts while improving management appropriateness. Finally, we show that DermFM-Zero's latent representations are interpretable: sparse autoencoders disentangle clinically meaningful concepts without supervision, and these concepts outperform predefined-vocabulary approaches and enable targeted suppression of artifact-induced biases, enhancing robustness without retraining. These findings demonstrate that a foundation model can provide effective, safe, and transparent zero-shot clinical decision support.
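To make the zero-shot diagnosis setting concrete, the sketch below shows how a contrastively trained vision-language model is commonly used for classification without fine-tuning: an image embedding is compared with text-prompt embeddings by cosine similarity, and a softmax turns the similarities into class probabilities. This is a minimal illustration, not DermFM-Zero's actual inference code. The embeddings are hand-written toy vectors, the function names are hypothetical, and the temperature of 0.07 is an assumed value borrowed from common contrastive-learning practice.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, text_embeddings, temperature=0.07):
    """Rank candidate diagnoses by image-text cosine similarity.

    text_embeddings maps each label to the embedding of its text prompt.
    The temperature is an assumption, not a value reported in the paper.
    """
    image = l2_normalize(image_embedding)
    labels = list(text_embeddings)
    similarities = [
        sum(a * b for a, b in zip(image, l2_normalize(text_embeddings[label])))
        for label in labels
    ]
    probabilities = softmax([s / temperature for s in similarities])
    return dict(zip(labels, probabilities))


# Toy 3-dimensional embeddings standing in for real encoder outputs (hypothetical).
text_embeddings = {
    "melanoma": [0.9, 0.1, 0.0],
    "psoriasis": [0.0, 1.0, 0.2],
    "eczema": [0.1, 0.3, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

probs = zero_shot_classify(image_embedding, text_embeddings)
prediction = max(probs, key=probs.get)
print(prediction, probs)
```

Because the label set lives only in the text prompts, new conditions can be added by extending `text_embeddings`, which is what lets this style of model avoid task-specific adaptation.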
Problem

Research questions and friction points this paper is trying to address.

zero-shot clinical collaboration
automated concept discovery
vision-language foundation model
dermatology
medical foundation models
Innovation

Methods, ideas, or system contributions that make the work stand out.

vision-language foundation model
zero-shot clinical collaboration
interpretable latent representations
contrastive learning
automated concept discovery
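
The interpretable-representation idea listed above can be illustrated with a toy sparse autoencoder. The sketch encodes an embedding into non-negative concept activations, zeroes the activation of one concept assumed to be an artifact, and decodes back, keeping the part of the embedding the autoencoder does not explain. It is a sketch under stated assumptions, not the authors' method: the weights are identity matrices chosen for readability, the artifact index is hypothetical, and the paper's summary does not specify the architecture or the suppression procedure.

```python
def relu(vector):
    """Clamp negative activations to zero, the usual sparse-autoencoder nonlinearity."""
    return [max(0.0, x) for x in vector]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def encode(embedding, w_enc, b_enc):
    """Map an embedding to non-negative concept activations."""
    pre_activation = [h + b for h, b in zip(matvec(w_enc, embedding), b_enc)]
    return relu(pre_activation)


def decode(activations, w_dec):
    """Rebuild an embedding; row k of w_dec is the direction of concept k."""
    dim = len(w_dec[0])
    return [
        sum(activations[k] * w_dec[k][j] for k in range(len(activations)))
        for j in range(dim)
    ]


def suppress_concepts(embedding, w_enc, b_enc, w_dec, concept_ids):
    """Remove the contribution of selected concepts without retraining anything."""
    activations = encode(embedding, w_enc, b_enc)
    full = decode(activations, w_dec)
    kept = [0.0 if k in concept_ids else a for k, a in enumerate(activations)]
    cleaned = decode(kept, w_dec)
    # Keep the residual the autoencoder fails to reconstruct, swap only the rest.
    return [x - f + c for x, f, c in zip(embedding, full, cleaned)]


# Toy weights (assumption): identity encoder and decoder, zero bias.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zero_bias = [0.0, 0.0, 0.0]
ARTIFACT_CONCEPT = 2  # hypothetical index, e.g. a concept firing on rulers or ink marks

embedding = [1.0, 0.5, 2.0]
cleaned = suppress_concepts(embedding, identity, zero_bias, identity, {ARTIFACT_CONCEPT})
print(cleaned)
```

The cleaned embedding can then be passed to the same downstream similarity scoring, which is why this kind of intervention requires no retraining of the underlying model.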
Authors

Siyuan Yan
Research Fellow, Monash University
AI for Medicine, Foundation Model

Xieji Li
AIM for Health Lab, Faculty of Information Technology, Monash University, Melbourne, Australia

Dan Mo
AIM for Health Lab, Faculty of Information Technology, Monash University, Melbourne, Australia

Philipp Tschandl
Medical University of Vienna

Yiwen Jiang
AIM for Health Lab, Faculty of Information Technology, Monash University, Melbourne, Australia

Zhonghua Wang
AIM for Health Lab, Faculty of Information Technology, Monash University, Melbourne, Australia

Ming Hu
Monash University | Shanghai AI Laboratory

Lie Ju
University College London; Moorfields Eye Hospital; Monash University
Computer Vision, Medical Image Analysis, Ophthalmology

Cristina Vico-Alonso
Dermatology Department, Fundacion Hospital 12 de Octubre, Madrid, Spain

Yizhen Zheng
PhD candidate, Monash University
AI4Drug Discovery, LLMs, GNNs

Jiahe Liu
AIM for Health Lab, Faculty of Information Technology, Monash University, Melbourne, Australia

Juexiao Zhou
Assistant Professor, The Chinese University of Hong Kong, Shenzhen
AI for Healthcare, Ethical AI, Bioinformatics, Privacy, AGI

Camilla Chello
Frazer Institute, The University of Queensland, Dermatology Research Centre, Brisbane, Australia

Jen G. Cheung
Victorian Melanoma Service, Alfred Care Group, Bayside Health, Melbourne, Australia

Julien Anriot
Claude Bernard Lyon-1 University, Lyon, France

Luc Thomas
Dermatology Department, Hôpital Lyon Sud, Hospices Civils de Lyon, Lyon, France

Clare Primiero
The University of Queensland

Gin Tan
eResearch Centre, Monash University, Melbourne, Australia

Aik Beng Ng
NVIDIA AI Technology Center, Singapore

Simon See
NVIDIA
applied mathematics, AI, machine learning, High Performance Computing, Simulation

Xiaoying Tang
Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China

Albert Ip
Surrey Hills Medical Centre, Melbourne, Australia

Xiaoyang Liao
General Practice Ward/International Medical Center Ward, General Practice Medical Center, West China Hospital, Sichuan University, 610041, Chengdu, Sichuan, China

Adrian Bowling
Independent Researcher, Melbourne, Australia

Martin Haskett
Independent Researcher, Melbourne, Australia