SkinGenBench: Generative Model and Preprocessing Effects for Synthetic Dermoscopic Augmentation in Melanoma Diagnosis

📅 2025-12-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
The impact of preprocessing complexity and generative model selection on the quality of synthetic dermoscopic images and downstream melanoma diagnosis remains poorly understood. Method: We introduce SkinGenBench—the first benchmark dedicated to dermoscopic image synthesis—and systematically compare StyleGAN2-ADA and DDPM models, evaluating geometric augmentation and artifact removal preprocessing strategies. Contribution/Results: Generative architecture choice dominates both image fidelity (FID ≈ 65.5; KID ≈ 0.05) and diagnostic utility, outweighing preprocessing effects; excessive artifact removal degrades clinically critical textural cues. Integrating synthetic data boosts ViT-B/16 performance to F1 = 0.88 and ROC-AUC = 0.98—improving baseline scores by 8–15 percentage points. This work establishes, for the first time, the central role of generative model design in medical image synthesis and provides a methodological foundation for trustworthy AI-assisted skin cancer diagnosis.

Technology Category

Application Category

📝 Abstract
This work introduces SkinGenBench, a systematic biomedical imaging benchmark that investigates how preprocessing complexity interacts with generative model choice for synthetic dermoscopic image augmentation and downstream melanoma diagnosis. Using a curated dataset of 14,116 dermoscopic images from HAM10000 and MILK10K across five lesion classes, we evaluate the two representative generative paradigms: StyleGAN2-ADA and Denoising Diffusion Probabilistic Models (DDPMs) under basic geometric augmentation and advanced artifact removal pipelines. Synthetic melanoma images are assessed using established perceptual and distributional metrics (FID, KID, IS), feature space analysis, and their impact on diagnostic performance across five downstream classifiers. Experimental results demonstrate that generative architecture choice has a stronger influence on both image fidelity and diagnostic utility than preprocessing complexity. StyleGAN2-ADA consistently produced synthetic images more closely aligned with real data distributions, achieving the lowest FID (~65.5) and KID (~0.05), while diffusion models generated higher variance samples at the cost of reduces perceptual fidelity and class anchoring. Advanced artifact removal yielded only marginal improvements in generative metrics and provided limited downstream diagnostic gains, suggesting possible suppression of clinically relevant texture cues. In contrast, synthetic data augmentation substantially improved melanoma detection with 8-15% absolute gains in melanoma F1-score, and ViT-B/16 achieving F1~0.88 and ROC-AUC~0.98, representing an improvement of approximately 14% over non-augmented baselines. Our code can be found at https://github.com/adarsh-crafts/SkinGenBench
Problem

Research questions and friction points this paper is trying to address.

Evaluates generative models for synthetic dermoscopic image augmentation
Assesses preprocessing impact on melanoma diagnostic performance
Compares StyleGAN2-ADA and diffusion models using perceptual metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic benchmark compares StyleGAN2-ADA and DDPMs
Evaluates preprocessing effects on synthetic dermoscopic image generation
Shows generative model choice impacts fidelity more than preprocessing
🔎 Similar Papers
No similar papers found.
N
N. A. Adarsh Pritam
Dept. of Advanced Computing, Alliance University, Bangalore, Karnataka 560087, India
J
Jeba Shiney O
Dept. of Advanced Computing, Alliance University, Bangalore, Karnataka 560087, India
Sanyam Jain
Sanyam Jain
PhD Aarhus University
GenAI