Doctor Approved: Generating Medically Accurate Skin Disease Images through AI-Expert Feedback

📅 2025-06-14

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

Medical imaging data are scarce and expensive to annotate, which limits the generalizability of diagnostic models. Diffusion models can synthesize additional training images, but their outputs often lack clinical credibility. To address this, we propose MAGIC, an AI-expert collaborative framework that uses a multimodal large language model (MLLM) as a scalable, automated medical evaluator, translating expert-defined clinical criteria into actionable feedback for image synthesis. MAGIC integrates knowledge-guided feedback distillation with conditional diffusion-based synthesis to generate high-fidelity skin lesion images at low annotation cost. The clinical quality of the synthesized images aligns with dermatologist assessments. In a 20-class skin disease classification task, augmenting with MAGIC images improves overall accuracy by 9.02%; under few-shot settings, the improvement reaches 13.89%. This work bridges the gap between generative modeling and clinical utility through expert-guided, knowledge-infused synthesis.


📝 Abstract

Paucity of medical data severely limits the generalizability of diagnostic ML models, as the full spectrum of disease variability cannot be represented by a small clinical dataset. To address this, diffusion models (DMs) have been considered a promising avenue for synthetic image generation and augmentation. However, they frequently produce medically inaccurate images, which degrades the performance of models trained on them. Expert domain knowledge is critical for synthesizing images that correctly encode clinical information, especially when data is scarce and quality outweighs quantity. Existing approaches for incorporating human feedback, such as reinforcement learning (RL) and Direct Preference Optimization (DPO), rely on robust reward functions or demand labor-intensive expert evaluations. Recent progress in Multimodal Large Language Models (MLLMs) reveals their strong visual reasoning capabilities, making them strong candidates as evaluators. In this work, we propose a novel framework, coined MAGIC (Medically Accurate Generation of Images through AI-Expert Collaboration), that synthesizes clinically accurate skin disease images for data augmentation. Our method translates expert-defined criteria into actionable feedback for DM image synthesis, significantly improving clinical accuracy while reducing the direct human workload. Experiments demonstrate that our method greatly improves the clinical quality of synthesized skin disease images, with outputs aligning with dermatologist assessments. Additionally, augmenting training data with these synthesized images improves diagnostic accuracy by +9.02% on a challenging 20-condition skin disease classification task, and by +13.89% in the few-shot setting.

Problem

Research questions and friction points this paper is trying to address.

Addressing medical data scarcity for ML diagnostic models

Improving accuracy of AI-generated skin disease images

Reducing expert workload in synthetic image validation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses diffusion models for synthetic image generation

Incorporates expert feedback via MLLM evaluators

Improves diagnostic accuracy with AI-augmented data
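
The idea of turning expert-defined criteria into feedback can be illustrated with a minimal sketch. This is not the authors' implementation: the checklist wording, the `stub_judge` function (a stand-in for an MLLM evaluator), and the choice to emit preferred/rejected pairs are all assumptions made for illustration. Pairs are used here only because the abstract names preference-based methods such as DPO as one way of incorporating feedback.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical expert-defined checklist (illustrative wording, not from the paper).
CRITERIA: List[str] = [
    "lesion location is plausible",
    "lesion color matches the condition",
    "lesion shape and border match the condition",
]

@dataclass
class Scored:
    image_id: str
    passed: Dict[str, bool]  # criterion -> did the evaluator accept it?

    @property
    def score(self) -> int:
        # Number of checklist items the image satisfies.
        return sum(self.passed.values())

def evaluate(image_id: str, judge: Callable[[str, str], bool]) -> Scored:
    """Ask the evaluator one yes/no question per checklist criterion."""
    return Scored(image_id, {c: judge(image_id, c) for c in CRITERIA})

def preference_pairs(scored: List[Scored]) -> List[Tuple[str, str]]:
    """Return (preferred, rejected) image-id pairs wherever scores differ."""
    ranked = sorted(scored, key=lambda s: s.score, reverse=True)
    return [
        (a.image_id, b.image_id)
        for i, a in enumerate(ranked)
        for b in ranked[i + 1:]
        if a.score > b.score
    ]

# Stand-in for an MLLM call: canned answers keyed by image id (assumption).
_CANNED = {
    "img_a": {c: True for c in CRITERIA},
    "img_b": {CRITERIA[0]: True, CRITERIA[1]: False, CRITERIA[2]: False},
    "img_c": {c: False for c in CRITERIA},
}

def stub_judge(image_id: str, criterion: str) -> bool:
    return _CANNED[image_id][criterion]

if __name__ == "__main__":
    results = [evaluate(i, stub_judge) for i in ("img_a", "img_b", "img_c")]
    for preferred, rejected in preference_pairs(results):
        print(f"{preferred} preferred over {rejected}")
```

In a real pipeline the judge would query an MLLM with the image and the criterion text, and the resulting pairs or scores would drive fine-tuning of the diffusion model; how MAGIC does this in detail is not stated in the text above.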

🔎 Similar Papers

Enhancing Skin Disease Diagnosis: Interpretable Visual Concept Discovery with SAM Empowerment

2024-09-14 · arXiv.org · Citations: 0
