DermaCon-IN: A Multi-concept Annotated Dermatological Image Dataset of Indian Skin Disorders for Clinical AI Research

📅 2025-06-06

🤖 AI Summary

Problem: Existing dermatological AI models suffer from dataset biases: they under-represent real-world outpatient settings, the full range of skin tones, and non-Western populations. Method: We introduce the first outpatient-oriented, multi-concept skin image dataset for India (5,450+ images, 240+ diagnoses), annotated using an etiology-driven hierarchical labeling scheme grounded in the Rook classification system. This scheme systematically integrates anatomical location, clinical concepts, and diagnostic granularity to ensure clinical fidelity and relevance to the local clinical context. We benchmark multiple architectures (ResNet, DenseNet, EfficientNet, ViT, MaxViT, and Swin) and incorporate Concept Bottleneck Models (CBMs) for concept-guided learning. Contribution/Results: Concept-level supervision significantly enhances model interpretability and cross-population generalization. Our dataset establishes a new foundation for trustworthy, reproducible, and scalable clinical AI, while the integrated annotation framework and CBM evaluation provide a methodological paradigm for domain-adapted dermatological modeling.

📝 Abstract

Artificial intelligence is poised to augment dermatological care by enabling scalable image-based diagnostics. Yet, the development of robust and equitable models remains hindered by datasets that fail to capture the clinical and demographic complexity of real-world practice. This complexity stems from region-specific disease distributions, wide variation in skin tones, and the underrepresentation of outpatient scenarios from non-Western populations. We introduce DermaCon-IN, a prospectively curated dermatology dataset comprising over 5,450 clinical images from approximately 3,000 patients across outpatient clinics in South India. Each image is annotated by board-certified dermatologists with over 240 distinct diagnoses, structured under a hierarchical, etiology-based taxonomy adapted from Rook's classification. The dataset captures a wide spectrum of dermatologic conditions and tonal variation commonly seen in Indian outpatient care. We benchmark a range of architectures including convolutional models (ResNet, DenseNet, EfficientNet), transformer-based models (ViT, MaxViT, Swin), and Concept Bottleneck Models to establish baseline performance and explore how anatomical and concept-level cues may be integrated. These results are intended to guide future efforts toward interpretable and clinically realistic models. DermaCon-IN provides a scalable and representative foundation for advancing dermatology AI in real-world settings.
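
To make the Concept Bottleneck Model idea concrete, the following is a minimal sketch of a CBM forward pass in plain Python. It is not the authors' implementation. The concept names, class names, feature vector, and weights are all hypothetical placeholders chosen for illustration. The point it shows is structural: the label head sees only the predicted concepts, never the raw image features.

```python
import math

# Hypothetical concept and class names; the dataset's actual vocabulary may differ.
CONCEPTS = ["scaling", "erythema", "annular_border"]
CLASSES = ["infectious", "inflammatory"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, inputs):
    """Apply one dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def predict_concepts(features, concept_w, concept_b):
    """Stage 1: image features -> independent concept probabilities."""
    return [sigmoid(z) for z in linear(concept_w, concept_b, features)]


def predict_from_concepts(concept_probs, label_w, label_b):
    """Stage 2: concept probabilities -> class probabilities (the bottleneck)."""
    return softmax(linear(label_w, label_b, concept_probs))


# Toy inputs: a 4-dimensional feature vector standing in for a backbone embedding.
features = [0.5, -1.0, 2.0, 0.0]
concept_w = [[1.0, 0.0, 0.5, 0.0],
             [0.0, 0.0, 1.0, 0.0],
             [0.0, 1.0, 0.0, 0.0]]
concept_b = [0.0, -0.5, 0.0]
label_w = [[0.2, 0.1, 2.0],
           [0.5, 1.0, -1.0]]
label_b = [0.0, 0.0]

concept_probs = predict_concepts(features, concept_w, concept_b)
class_probs = predict_from_concepts(concept_probs, label_w, label_b)

# Test-time intervention: a clinician asserts that an annular border is present.
corrected = list(concept_probs)
corrected[CONCEPTS.index("annular_border")] = 1.0
intervened_probs = predict_from_concepts(corrected, label_w, label_b)
```

Because the class prediction depends only on the concept vector, overriding one concept changes the diagnosis in a traceable way, which is the interpretability property that motivates CBMs. In a real system the feature vector would come from one of the benchmarked backbones and both stages would be trained rather than hand-set.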

Problem

Research questions and friction points this paper is trying to address.

Lack of diverse dermatological datasets for Indian skin disorders

Underrepresentation of non-Western populations in clinical AI research

Need for interpretable models in dermatology AI diagnostics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-concept annotated dermatological image dataset

Hierarchical etiology-based taxonomy for diagnoses

Baseline benchmarks across convolutional, transformer-based, and Concept Bottleneck Model architectures
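
As an illustration of the multi-concept, hierarchical annotation described above, one image record could be represented as follows. This is a sketch under assumptions: the field names, level names, and example records are hypothetical and do not reproduce the dataset's actual schema or labels.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical hierarchy levels, ordered from coarse to fine.
LEVELS = ("main_class", "sub_class", "diagnosis")


@dataclass(frozen=True)
class Annotation:
    image_id: str
    main_class: str      # coarse etiological group
    sub_class: str       # intermediate group
    diagnosis: str       # fine-grained diagnosis
    body_site: str       # anatomical location
    concepts: tuple = () # clinical descriptors visible in the image


def label_at(annotation, level):
    """Return the label of an annotation at the requested hierarchy level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return getattr(annotation, level)


def count_by_level(annotations, level):
    """Count images per label at one level of the hierarchy."""
    return Counter(label_at(a, level) for a in annotations)


# Made-up example records, for illustration only.
records = [
    Annotation("img_001", "infectious", "fungal", "tinea corporis",
               "trunk", ("scaling", "annular_border")),
    Annotation("img_002", "infectious", "bacterial", "impetigo",
               "face", ("crusting",)),
    Annotation("img_003", "inflammatory", "papulosquamous", "psoriasis",
               "elbow", ("scaling", "erythema")),
]

coarse_counts = count_by_level(records, "main_class")
fine_counts = count_by_level(records, "diagnosis")
```

Keeping every level on the record lets the same images be evaluated at different diagnostic granularities, for example coarse classification over `main_class` and fine-grained classification over `diagnosis`, without relabeling.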
