🤖 AI Summary
Current medical AI systems handle tasks in isolation, without coordinated validation between diagnosis and medication prescription, which limits their clinical utility. To address this, we propose MedCoAct, a confidence-aware multi-agent collaboration framework comprising specialized physician and pharmacist agents. These agents interact dynamically, driven by confidence, to emulate the knowledge integration and cross-validation of real-world clinical teams. The approach supports interpretable reasoning across the integrated diagnosis and treatment workflow. We further introduce DrugCareQA, a benchmark designed to evaluate integrated clinical decision-making performance. Experimental results show that the framework achieves 67.58% accuracy on both diagnosis and medication recommendation, outperforming a single-agent baseline by 7.04% and 7.08%, respectively. It generalizes well across diverse medical domains and is especially effective for telemedicine consultations and routine clinical scenarios.
📝 Abstract
Autonomous agents utilizing Large Language Models (LLMs) have demonstrated remarkable capabilities in isolated medical tasks like diagnosis and image analysis, but struggle with integrated clinical workflows that connect diagnostic reasoning and medication decisions. We identify a core limitation: existing medical AI systems process tasks in isolation without the cross-validation and knowledge integration found in clinical teams, reducing their effectiveness in real-world healthcare scenarios. To transform the isolation paradigm into a collaborative approach, we propose MedCoAct, a confidence-aware multi-agent framework that simulates clinical collaboration by integrating specialized doctor and pharmacist agents, and present a benchmark, DrugCareQA, to evaluate medical AI capabilities in integrated diagnosis and treatment workflows. Our results demonstrate that MedCoAct achieves 67.58% diagnostic accuracy and 67.58% medication recommendation accuracy, outperforming a single-agent framework by 7.04% and 7.08%, respectively. This collaborative approach generalizes well across diverse medical domains, proving especially effective for telemedicine consultations and routine clinical scenarios, while providing interpretable decision-making pathways.