🤖 AI Summary
Medical vision-language models exhibit skin-tone-related biases in dermatological diagnosis, degrading performance for patients with darker skin tones and for underrepresented rare disease classes. To address this, we propose DermDiT, a novel framework that leverages fine-grained clinical prompts generated by multimodal large vision-language models (e.g., LLaVA, Qwen-VL) to condition diffusion-based dermoscopic image generation, enabling fairness-aware, controllable, and high-fidelity synthetic data augmentation. DermDiT alleviates representational scarcity in minority subgroups, achieving an average 12.3% F1-score improvement across multiple benchmarks, with particular gains in diagnostic accuracy for darker-skinned patients. In a clinical evaluation, dermatologists rated 87% of the synthesized images as clinically usable. Our work establishes a new paradigm for trustworthy and equitable AI-assisted dermatological diagnosis.
📝 Abstract
Artificial Intelligence (AI) for skin disease diagnosis has improved significantly, but a major concern is that these models frequently show biased performance across subgroups, especially with respect to sensitive attributes such as skin color. To address this issue, we propose a novel generative AI-based framework, the Dermatology Diffusion Transformer (DermDiT), which leverages text prompts generated by vision-language models and multimodal text-image learning to generate new dermoscopic images. We use large vision-language models to produce accurate, informative prompts for each dermoscopic image; these prompts guide the generation of synthetic images that improve the representation of underrepresented groups (patients, diseases, etc.) in the highly imbalanced datasets used for clinical diagnosis. Our extensive experiments show that large vision-language models provide much more insightful representations, enabling DermDiT to generate high-quality images. Our code is available at https://github.com/Munia03/DermDiT
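The abstract's core use of synthetic images is to rebalance underrepresented patient and disease subgroups. As a minimal sketch (not from the paper; the function name and toy labels are illustrative assumptions), the number of synthetic images to request from a generator like DermDiT for each (disease, skin-tone) subgroup can be computed by topping every subgroup up to the size of the largest one:

```python
from collections import Counter

def synthetic_targets(labels, target_count=None):
    """For each subgroup label, return how many synthetic images to
    generate so every subgroup reaches target_count (default: the
    size of the largest subgroup)."""
    counts = Counter(labels)
    target = target_count or max(counts.values())
    return {group: max(0, target - n) for group, n in counts.items()}

# Toy imbalanced dataset: (disease, skin-tone subgroup) per image.
labels = (
    [("melanoma", "light")] * 80
    + [("melanoma", "dark")] * 5
    + [("nevus", "light")] * 100
    + [("nevus", "dark")] * 15
)
targets = synthetic_targets(labels)
print(targets[("melanoma", "dark")])  # 95 synthetic images needed
```

Each requested image would then be generated conditioned on a VLM-derived prompt describing that subgroup's clinical appearance; the conditioning step itself requires the trained diffusion model and is omitted here.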