Establishing dermatopathology encyclopedia DermpathNet with Artificial Intelligence-Based Workflow.

📅 2026-01-27
🏛️ Scientific Data
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the scarcity of high-quality, openly accessible image datasets in dermatopathology, which hinders both clinical education and machine learning research. The authors propose a hybrid AI workflow that combines deep learning–based image modality classification with analysis of figure captions to retrieve, filter, and annotate dermatopathology images from PubMed Central. Board-certified dermatopathologists review the output, making the curation process semi-automated. The result is DermpathNet, an open-access dataset of 7,772 images spanning 166 diagnoses. On 651 manually annotated validation images, the hybrid approach achieves an F-score of 90.4%, compared with 89.6% for deep learning alone and 61.0% for keyword-based retrieval alone. Beyond providing a resource for education and algorithm development, the work also finds that OpenAI's current image analysis algorithm is inadequate for analyzing dermatopathology images.

📝 Abstract
Accessing high-quality, open-access dermatopathology image datasets for learning and cross-referencing is a common challenge for clinicians and trainees. To establish a comprehensive open-access dermatopathology dataset for educational, cross-referencing, and machine-learning purposes, we employed a hybrid workflow to curate and categorize images from the PubMed Central (PMC) repository. We used specific keywords to extract relevant images, and classified them using a novel hybrid method that combined deep learning-based image modality classification with figure caption analyses. Validation on 651 manually annotated images demonstrated the robustness of our workflow, with an F-score of 89.6% for the deep learning approach, 61.0% for the keyword-based retrieval method, and 90.4% for the hybrid approach. We retrieved over 7,772 images across 166 diagnoses and released this fully annotated dataset, reviewed by board-certified dermatopathologists. Using our dataset as a challenging task, we found the current image analysis algorithm from OpenAI inadequate for analyzing dermatopathology images. In conclusion, we have developed a large, peer-reviewed, open-access dermatopathology image dataset, DermpathNet, which features a semi-automated curation workflow.
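The abstract does not spell out how the image classifier and the caption analysis are combined, so the following is only a minimal sketch of one plausible hybrid rule, together with the F-score used for validation. The keyword list, both thresholds, the `Figure` fields, and the function names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical caption keywords and thresholds; the paper's actual values are not given here.
CAPTION_KEYWORDS = ("hematoxylin", "h&e", "histopathology", "biopsy")
HIGH_CONFIDENCE = 0.9  # assumed: accept on the classifier score alone at or above this
LOW_CONFIDENCE = 0.5   # assumed: below this, reject regardless of the caption


@dataclass
class Figure:
    caption: str
    model_prob: float  # assumed classifier probability that the image is histopathology
    label: bool        # manual annotation used for validation


def caption_match(caption: str) -> bool:
    """Return True if any keyword occurs in the lower-cased caption."""
    text = caption.lower()
    return any(keyword in text for keyword in CAPTION_KEYWORDS)


def hybrid_keep(fig: Figure) -> bool:
    """Keep confident predictions; let the caption decide borderline ones."""
    if fig.model_prob >= HIGH_CONFIDENCE:
        return True
    return fig.model_prob >= LOW_CONFIDENCE and caption_match(fig.caption)


def f_score(predictions: list[bool], labels: list[bool]) -> float:
    """F1: harmonic mean of precision and recall over paired predictions and labels."""
    tp = sum(p and l for p, l in zip(predictions, labels))
    fp = sum(p and not l for p, l in zip(predictions, labels))
    fn = sum(l and not p for p, l in zip(predictions, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy validation set, invented for illustration only.
figures = [
    Figure("H&E stain showing epidermal atrophy", 0.95, True),
    Figure("Clinical photograph of the lesion", 0.60, False),
    Figure("Histopathology of the biopsy specimen", 0.70, True),
    Figure("Dermoscopy image", 0.20, False),
    Figure("Punch specimen, low power", 0.40, True),
]
predictions = [hybrid_keep(f) for f in figures]
score = f_score(predictions, [f.label for f in figures])
```

In this sketch the caption acts as a tie-breaker for borderline classifier scores, which is one way a combined filter could edge out either signal alone; the real workflow may combine them differently.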
Problem

Research questions and friction points this paper is trying to address.

dermatopathology
image dataset
open-access
AI-based workflow
medical education
Innovation

Methods, ideas, or system contributions that make the work stand out.

dermatopathology
hybrid curation workflow
deep learning
image-text analysis
open-access dataset
Ziyang Xu
The Chinese University of Hong Kong
AI for Science, Bioinformatics, Medical Image Processing
Mingquan Lin
Assistant Professor at University of Minnesota
Medical image analysis, Deep learning
Yiliang Zhou
University of California, Irvine
NLP, AI in healthcare, LLM
Zihan Xu
Arizona State University
Machine Learning, Neuromorphic Computing, Memory
S. Orlow
Division of Dermatopathology, Mount Sinai Health, New York, USA
Shane A. Meehan
Division of Dermatopathology, Mount Sinai Health, New York, USA
A. Flamm
Perelman Department of Dermatology, NYU Grossman School of Medicine, New York, USA
A. Moshiri
Perelman Department of Dermatology, NYU Grossman School of Medicine, New York, USA
Yifan Peng
Associate Professor at Weill Cornell Medicine
NLP, CV, machine learning