Doctor Approved: Generating Medically Accurate Skin Disease Images through AI-Expert Feedback

📅 2025-06-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Medical imaging data are scarce and expensive to annotate, and the diffusion models used to synthesize additional images often produce results that lack clinical credibility, which limits the generalizability of diagnostic models trained on them. To address this, we propose MAGIC, an AI-expert collaborative framework that introduces a multimodal large language model (MLLM) as a scalable, automated medical evaluator, translating expert clinical criteria into actionable feedback for image synthesis. MAGIC integrates knowledge-guided feedback distillation with conditional diffusion-based synthesis to generate high-fidelity skin lesion images at low annotation cost. Evaluation by dermatologists confirms the clinical credibility of the synthesized images. In a 20-class skin disease classification task, augmenting with MAGIC images improves overall accuracy by 9.02%; under few-shot settings, the improvement reaches 13.89%. This work bridges the gap between generative modeling and clinical utility through human-in-the-loop, knowledge-infused synthesis.

📝 Abstract
A paucity of medical data severely limits the generalizability of diagnostic ML models, as the full spectrum of disease variability cannot be represented by a small clinical dataset. To address this, diffusion models (DMs) have been considered a promising avenue for synthetic image generation and augmentation. However, they frequently produce medically inaccurate images, which degrades the performance of models trained on them. Expert domain knowledge is critical for synthesizing images that correctly encode clinical information, especially when data is scarce and quality outweighs quantity. Existing approaches for incorporating human feedback, such as reinforcement learning (RL) and Direct Preference Optimization (DPO), rely on robust reward functions or demand labor-intensive expert evaluations. Recent progress in Multimodal Large Language Models (MLLMs) reveals their strong visual reasoning capabilities, making them well-suited candidates as evaluators. In this work, we propose a novel framework, coined MAGIC (Medically Accurate Generation of Images through AI-Expert Collaboration), that synthesizes clinically accurate skin disease images for data augmentation. Our method translates expert-defined criteria into actionable feedback for DM image synthesis, significantly improving clinical accuracy while reducing the direct human workload. Experiments demonstrate that our method greatly improves the clinical quality of synthesized skin disease images, with outputs aligning with dermatologist assessments. Additionally, augmenting training data with these synthesized images improves diagnostic accuracy by 9.02% on a challenging 20-condition skin disease classification task, and by 13.89% in the few-shot setting.
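
To make concrete why preference-based methods such as DPO depend on pairwise judgments, the following is a minimal sketch of the standard DPO loss for a single preferred/rejected pair. It is the generic objective as commonly stated for language models, not necessarily the variant MAGIC applies to diffusion models; the abstract does not specify that. The function name, the scalar log-probability inputs, and the default `beta` are assumptions made for illustration.

```python
import math


def dpo_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """Generic DPO loss for one preference pair (illustrative sketch).

    logp_*     : log-probability of the preferred / rejected sample under
                 the model being trained (hypothetical scalar inputs).
    ref_logp_* : the same quantities under a frozen reference model.
    beta       : strength of the implicit constraint to the reference.
    """
    # How much more the trained model favors the preferred sample over the
    # rejected one, relative to the reference model.
    margin = (logp_win - ref_logp_win) - (logp_lose - ref_logp_lose)
    # -log(sigmoid(beta * margin)), written as a numerically stable softplus.
    x = -beta * margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# When the model and reference agree, the loss equals log(2).
baseline = dpo_loss(0.0, 0.0, 0.0, 0.0)
# Favoring the preferred sample more than the reference lowers the loss.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

Every training signal here comes from a (preferred, rejected) pair, which is why the cost of obtaining those judgments matters: if each pair requires a dermatologist's review, annotation effort grows with the number of pairs.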
Problem

Research questions and friction points this paper is trying to address.

Addressing medical data scarcity for ML diagnostic models
Improving accuracy of AI-generated skin disease images
Reducing expert workload in synthetic image validation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses diffusion models for synthetic image generation
Incorporates expert feedback via MLLM evaluators
Improves diagnostic accuracy with AI-augmented data
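
The idea of turning expert-defined criteria into feedback can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the criterion names (`location`, `color`, `texture`), the `Evaluation` type, and the `preference_pairs` helper are all hypothetical, and the per-criterion verdicts are assumed to have already been returned by an MLLM evaluator (no model call is made here).

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Evaluation:
    """Checklist verdicts for one synthesized image (hypothetical structure)."""
    image_id: str
    checks: dict[str, bool]  # criterion name -> whether the evaluator judged it met

    @property
    def score(self) -> int:
        # Number of criteria the image satisfies.
        return sum(self.checks.values())


def preference_pairs(evals, min_gap=1):
    """Return (preferred_id, rejected_id) pairs from checklist scores.

    Pairs whose score difference is below min_gap are skipped, since they
    carry no clear preference.
    """
    pairs = []
    for a, b in combinations(evals, 2):
        gap = a.score - b.score
        if abs(gap) >= min_gap:
            winner, loser = (a, b) if gap > 0 else (b, a)
            pairs.append((winner.image_id, loser.image_id))
    return pairs


# Assumed evaluator outputs for three images generated from the same prompt.
evals = [
    Evaluation("img_a", {"location": True, "color": True, "texture": True}),
    Evaluation("img_b", {"location": True, "color": False, "texture": False}),
    Evaluation("img_c", {"location": True, "color": True, "texture": True}),
]
pairs = preference_pairs(evals)
# pairs == [("img_a", "img_b"), ("img_c", "img_b")]
```

In a scheme like this, the expert's effort goes into writing the checklist once, while the per-image judgments are delegated to the evaluator; the resulting pairs could then feed a preference-based fine-tuning step.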