SkinGPT-X: A Self-Evolving Collaborative Multi-Agent System for Transparent and Trustworthy Dermatological Diagnosis

📅 2026-03-27
🤖 AI Summary
This work addresses the limitations of monolithic large language models in diagnosing fine-grained, large-scale multi-class, and rare skin diseases—challenges arising from data sparsity and the lack of interpretability and traceability. To overcome these issues, we propose a multimodal collaborative multi-agent system that emulates the clinical workflow of dermatologists and introduces, for the first time, a self-evolving dermatological memory mechanism that transcends the constraints of static knowledge bases. By integrating multi-agent collaboration, multimodal alignment, and fine-grained classification, our approach achieves a 9.6% accuracy gain on DDI31 and a 13% improvement in weighted F1 score on Dermnet. It also significantly outperforms existing methods on a fine-grained dataset encompassing 498 disease classes and on a novel rare-disease benchmark comprising 564 samples across eight rare conditions, enabling transparent, trustworthy, and clinically viable dermatological diagnosis.
📝 Abstract
Although recent advances in Large Language Models (LLMs) have substantially improved dermatological diagnosis, monolithic LLMs frequently struggle with fine-grained, large-scale multi-class diagnostic tasks and with rare skin diseases owing to sparse training data, and they lack the interpretability and traceability essential for clinical reasoning. Multi-agent systems can offer more transparent and explainable diagnostics, but existing frameworks concentrate primarily on Visual Question Answering and conversational tasks, and their heavy reliance on static knowledge bases restricts adaptability in complex real-world clinical settings. Here, we present SkinGPT-X, a multimodal collaborative multi-agent system for dermatological diagnosis with an integrated self-evolving dermatological memory mechanism. By simulating the diagnostic workflow of dermatologists and enabling continuous memory evolution, SkinGPT-X delivers transparent and trustworthy diagnostics for complex and rare dermatological cases. To validate the robustness of SkinGPT-X, we design a three-tier comparative experiment. First, we benchmark SkinGPT-X against four state-of-the-art LLMs across four public datasets, where it achieves a +9.6% accuracy improvement on DDI31 and a +13% weighted F1 gain on Dermnet over the previous state-of-the-art model. Second, we construct a large-scale multi-class dataset covering 498 distinct dermatological categories to evaluate its fine-grained classification capabilities. Finally, we curate a rare skin disease dataset, the first benchmark to address the scarcity of clinical data on rare skin diseases, which contains 564 clinical samples spanning eight rare dermatological diseases. On this dataset, SkinGPT-X achieves a +9.8% accuracy improvement, a +7.1% weighted F1 improvement, and a +10% Cohen's Kappa improvement.
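The abstract reports three metrics: accuracy, weighted F1, and Cohen's Kappa. As a reading aid, the following is a minimal sketch of how these metrics are conventionally defined for a multi-class classifier. It is not the authors' evaluation code; the function names and the toy labels (`"a"`, `"b"`, `"c"`) are hypothetical and unrelated to the paper's datasets.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n  # weight each class by its number of true samples
    return total / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Expected agreement if labels were assigned independently
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate single-class case; returning 1.0 is an assumption here
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy labels for illustration only (hypothetical, not from the paper)
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]

print(accuracy(y_true, y_pred))
print(weighted_f1(y_true, y_pred))
print(cohen_kappa(y_true, y_pred))
```

Weighted F1 is informative on imbalanced label sets because each class contributes in proportion to its support, while Cohen's Kappa discounts agreement that would be expected by chance alone.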
Problem

Research questions and friction points this paper is trying to address.

dermatological diagnosis
multi-agent system
rare skin disease
interpretability
fine-grained classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

self-evolving memory
collaborative multi-agent system
transparent diagnosis
rare skin disease
multimodal dermatological AI
Authors
Zhangtianyi Chen
School of Data Science, The Chinese University of Hong Kong, Shenzhen
Yuhao Shen
School of Data Science, The Chinese University of Hong Kong, Shenzhen
Florensia Widjaja
School of Data Science, The Chinese University of Hong Kong, Shenzhen
Yan Xu
Department of Dermatology, Tianjin Institute of Integrative Dermatology, Tianjin Academy of Traditional Chinese Medicine Affiliated Hospital, Tianjin 300120, China
Liyuan Sun
Department of Dermatology, Beijing AnZhen Hospital, Capital Medical University, Beijing 100029, China
Zijian Wang
School of Data Science, The Chinese University of Hong Kong, Shenzhen
Hongyi Chen
School of Data Science, The Chinese University of Hong Kong, Shenzhen
Wufei Dai
School of Data Science, The Chinese University of Hong Kong, Shenzhen
Juexiao Zhou
Assistant Professor, The Chinese University of Hong Kong, Shenzhen
AI for Healthcare, Ethical AI, Bioinformatics, Privacy, AGI