MACD: Multi-Agent Clinical Diagnosis with Self-Learned Knowledge for LLM

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address two limitations of large language models (LLMs) in complex clinical diagnosis, isolated reasoning and non-reusable experiential knowledge, this paper proposes a self-evolving multi-agent clinical diagnostic framework. Diagnostic agents, an evaluation agent, and human physicians interact iteratively, with humans in the loop, so that the system distills clinical knowledge on its own and accumulates it over time; the resulting knowledge is stable across models, transferable, and personalized to each model. Built on open-source models including Llama-3.1 and DeepSeek-R1-Distill-Llama, the system supports traceable reasoning and human-AI collaborative decision-making. In experiments on 4,390 real-world cases spanning seven diseases, the framework improves primary diagnosis accuracy by up to 22.3% over established clinical guidelines, by up to 16% over physician-only diagnosis on a subset of the data, and by 18.6% over physician-only diagnosis under human-AI collaboration.

📝 Abstract
Large language models (LLMs) have demonstrated notable potential in medical applications, yet they face substantial challenges in handling complex real-world clinical diagnoses using conventional prompting methods. Current prompt engineering and multi-agent approaches typically optimize isolated inferences, neglecting the accumulation of reusable clinical experience. To address this, this study proposes a novel Multi-Agent Clinical Diagnosis (MACD) framework, which allows LLMs to self-learn clinical knowledge via a multi-agent pipeline that summarizes, refines, and applies diagnostic insights. It mirrors how physicians develop expertise through experience, enabling more accurate diagnosis focused on key disease-specific cues. We further extend it to a MACD-human collaborative workflow, where multiple LLM-based diagnostician agents engage in iterative consultations, supported by an evaluator agent and by human oversight for cases where agreement is not reached. Evaluated on 4,390 real-world patient cases across seven diseases using diverse open-source LLMs (Llama-3.1 8B/70B, DeepSeek-R1-Distill-Llama 70B), MACD significantly improves primary diagnostic accuracy, outperforming established clinical guidelines with gains of up to 22.3%. On a subset of the data, it achieves performance on par with or exceeding that of human physicians (up to 16% improvement over physicians-only diagnosis). Additionally, the MACD-human workflow achieves an 18.6% improvement compared to physicians-only diagnosis. Moreover, the self-learned knowledge exhibits strong cross-model stability, transferability, and model-specific personalization, while the system can generate traceable rationales, enhancing explainability. Consequently, this work presents a scalable self-learning paradigm for LLM-assisted diagnosis, bridging the gap between the intrinsic knowledge of LLMs and real-world clinical practice.
Problem

Research questions and friction points this paper is trying to address.

LLMs struggle with complex clinical diagnoses using standard prompting methods
Current approaches neglect accumulation of reusable clinical experience
Need to bridge the gap between the intrinsic knowledge of LLMs and real-world clinical practice
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework for self-learning clinical knowledge
Iterative consultations with evaluator agent and human oversight
Generates traceable rationales for explainable diagnosis
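
The iterative consultation described in the abstract (several diagnostician agents, an evaluator agent, and human oversight when agreement is not reached) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the unanimity rule used by the evaluator, the round limit, and the toy agents are all assumptions made for illustration, and real agents would wrap LLM calls.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: a diagnostician receives the case text and the
# opinions from the previous round, and returns a diagnosis label.
Diagnostician = Callable[[str, List[str]], str]
# Hypothetical interface: an evaluator returns the agreed diagnosis,
# or None when the agents have not reached agreement.
Evaluator = Callable[[List[str]], Optional[str]]


def consult(
    case: str,
    diagnosticians: List[Diagnostician],
    evaluator: Evaluator,
    max_rounds: int = 3,  # assumed limit, not taken from the paper
) -> Tuple[Optional[str], bool]:
    """Run consultation rounds; return (diagnosis, needs_human_review)."""
    opinions: List[str] = []
    for _ in range(max_rounds):
        # Every agent sees the opinions produced in the previous round.
        opinions = [agent(case, opinions) for agent in diagnosticians]
        verdict = evaluator(opinions)
        if verdict is not None:
            return verdict, False
    # No agreement within the round limit: hand the case to a physician.
    return None, True


def unanimous(opinions: List[str]) -> Optional[str]:
    """Toy evaluator (assumption): accept only a unanimous diagnosis."""
    return opinions[0] if len(set(opinions)) == 1 else None


# Toy stand-ins for LLM-backed agents.
def stubborn(case: str, peers: List[str]) -> str:
    return "pneumonia"


def dissenter(case: str, peers: List[str]) -> str:
    return "bronchitis"


def follower(case: str, peers: List[str]) -> str:
    # Adopts the first peer opinion once one is available.
    return peers[0] if peers else "bronchitis"


agreed = consult("cough and fever", [stubborn, follower], unanimous)
escalated = consult("cough and fever", [stubborn, dissenter], unanimous, max_rounds=2)
```

In this sketch, `agreed` ends with a shared diagnosis after the second round, while `escalated` exhausts its rounds and is flagged for human review, which is the point where the described workflow brings in a physician.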
Wenliang Li
School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei Anhui, 230026, China
Rui Yan
School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei Anhui, 230026, China
Xu Zhang
School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei Anhui, 230026, China
Li Chen
Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei Anhui, 230001, China
Hongji Zhu
Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei Anhui, 230001, China
Jing Zhao
Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei Anhui, 230001, China
Junjun Li
Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei Anhui, 230001, China
Mengru Li
Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei Anhui, 230001, China
Wei Cao
Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei Anhui, 230001, China
Zihang Jiang
School of Biomedical Engineering, USTC, Suzhou Institute for Advanced Research
Computer Vision, Medical Imaging, 3D
Wei Wei
Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei Anhui, 230001, China
Kun Zhang
School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei Anhui, 230026, China
Shaohua Kevin Zhou
Professor, USTC, FAIMBE, FIAMBE, FIEEE, FMICCAI, FNAI
Medical Image Computing, Computer Vision & Pattern Recognition, Machine & Deep Learning