DermaSynth: Rich Synthetic Image-Text Pairs Using Open Access Dermatology Datasets

📅 2025-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Dermatology lacks large, high-quality image–text datasets, which hinders the development of vision-language models for the field. DermaSynth addresses this gap with 92,020 synthetic image–text pairs covering both clinical photographs and dermatoscopic images, built on open-access dermatology datasets released under CC-BY-4.0 licenses. The texts were generated with Gemini 2.0 using clinically related prompts and a self-instruct method, and dataset metadata was included in the prompts to reduce potential hallucinations. The authors also fine-tuned Llama-3.2-11B-Vision-Instruct on 5,000 samples to produce DermatoLlama 1.0, a preliminary vision-language model. The data and code are publicly available.

📝 Abstract
A major barrier to developing vision large language models (LLMs) in dermatology is the lack of large image–text pair datasets. We introduce DermaSynth, a dataset comprising 92,020 synthetic image–text pairs curated from 45,205 images (13,568 clinical and 35,561 dermatoscopic) for dermatology-related clinical tasks. Using Gemini 2.0, a state-of-the-art LLM, we applied clinically related prompts and the self-instruct method to generate diverse and rich synthetic texts. Metadata from the source datasets was incorporated into the input prompts to reduce potential hallucinations. The resulting dataset builds upon open-access dermatological image repositories (DERM12345, BCN20000, PAD-UFES-20, SCIN, and HIBA) that have permissive CC-BY-4.0 licenses. We also fine-tuned a preliminary Llama-3.2-11B-Vision-Instruct model, DermatoLlama 1.0, on 5,000 samples. We anticipate that this dataset will support and accelerate AI research in dermatology. The data and code underlying this work are accessible at https://github.com/abdurrahimyilmaz/DermaSynth.
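The abstract describes incorporating dataset metadata into the input prompts to reduce hallucinations. The following is only a minimal sketch of that general idea, not the authors' implementation: the function name `build_prompt`, the metadata field names (`diagnosis`, `body_site`, `age`), and the prompt wording are all hypothetical, and no model API is called.

```python
from typing import Mapping


def build_prompt(metadata: Mapping[str, object], task: str) -> str:
    """Compose a text prompt that grounds a model in known image metadata.

    Hypothetical helper for illustration; field names and wording are assumptions.
    """
    # Keep only fields that carry a value, so the prompt never invites
    # the model to guess a missing attribute.
    facts = [
        f"- {key}: {value}"
        for key, value in metadata.items()
        if value not in (None, "")
    ]
    if not facts:
        facts = ["- (no metadata available)"]
    return "\n".join(
        [
            "Known facts about this image (from dataset metadata):",
            *facts,
            "",
            f"Task: {task}",
            "Answer using only the image and the facts above.",
        ]
    )


example = build_prompt(
    {"diagnosis": "melanocytic nevus", "body_site": "back", "age": None},
    "Write one question and answer about the lesion.",
)
print(example)
```

Listing the known facts explicitly, and omitting empty fields, gives the text generator less room to invent attributes that the source dataset does not record.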
Problem

Research questions and friction points this paper is trying to address.

Dermatology
Image-Text Datasets
Large Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

DermaSynth
Gemini 2.0
Clinical Knowledge-Guided Generation
Abdurrahim Yilmaz
Imperial College London
Deep Learning, AI for Dermatology, Microrobotics
Furkan Yuceyalcin
Yildiz Technical University
Ece Gokyayla
Usak Research and Training Hospital
Donghee Choi
Assistant Professor at Pusan National University
BioNLP, Clinical NLP, AI Dietitian, Finance AI
Ozan Erdem Ali Anil Demircali
Istanbul Medeniyet University
Rahmetullah Varol
Universität der Bundeswehr München
Artificial Intelligence, Robotics
Ufuk Gorkem Kirabali
Yildiz Technical University
G. Gencoglan
Istanbul Medicana Atakoy Hospital
J. Posma
Imperial College London
Burak Temelkuran
Imperial College London