🤖 AI Summary
Early identification of cognitive concerns remains clinically challenging because symptoms are often subtle. To address this, we propose a fully automated multi-agent workflow built on LLaMA 3 8B, in which task-specific agents collaborate dynamically to extract and interpret information from real-world clinical notes. The workflow was evaluated on 3,338 clinical notes from Mass General Brigham and compared against an expert-driven benchmark workflow. The main findings are: (1) the agentic workflow reached F1 = 0.90, comparable to the expert-driven workflow's 0.91; (2) it achieved a specificity of 1.00 and completed prompt refinement in fewer iterations; and (3) although both workflows performed worse on validation data, the agentic workflow maintained perfect specificity. These results suggest that fully automated multi-agent workflows can approach expert-level accuracy with greater efficiency, making them a scalable and cost-effective option for detecting cognitive concerns in clinical settings.
📝 Abstract
Early identification of cognitive concerns is critical but often hindered by subtle symptom presentation. This study developed and validated a fully automated, multi-agent AI workflow using LLaMA 3 8B to identify cognitive concerns in 3,338 clinical notes from Mass General Brigham. The agentic workflow, leveraging task-specific agents that dynamically collaborate to extract meaningful insights from clinical notes, was compared to an expert-driven benchmark. Both workflows achieved high classification performance, with F1-scores of 0.90 and 0.91, respectively. The agentic workflow demonstrated improved specificity (1.00) and achieved prompt refinement in fewer iterations. Although both workflows showed reduced performance on validation data, the agentic workflow maintained perfect specificity. These findings highlight the potential of fully automated multi-agent AI workflows to achieve expert-level accuracy with greater efficiency, offering a scalable and cost-effective solution for detecting cognitive concerns in clinical settings.
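To make the reported metrics concrete, the following minimal sketch shows how F1 and specificity are computed from a binary confusion matrix, and how a specificity of 1.00 can coexist with an F1-score of 0.90. The counts are hypothetical, chosen only to reproduce those two values. They are not taken from the study, and the names `ConfusionCounts`, `specificity`, `recall`, and `f1_score` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts for one classifier on one dataset."""
    tp: int  # notes with a cognitive concern, flagged
    fp: int  # notes without a concern, flagged
    tn: int  # notes without a concern, not flagged
    fn: int  # notes with a concern, missed


def specificity(c: ConfusionCounts) -> float:
    """True-negative rate: TN / (TN + FP)."""
    negatives = c.tn + c.fp
    return c.tn / negatives if negatives else 0.0


def recall(c: ConfusionCounts) -> float:
    """Sensitivity: TP / (TP + FN)."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else 0.0


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


# Hypothetical counts (NOT from the paper): no false positives, some misses.
example = ConfusionCounts(tp=90, fp=0, tn=100, fn=20)

print(f"specificity = {specificity(example):.2f}")
print(f"recall      = {recall(example):.2f}")
print(f"F1          = {f1_score(example):.2f}")
```

With zero false positives, specificity is 1.00 regardless of how many positive notes are missed, so any shortfall in F1 in this example comes entirely from false negatives, which lower recall.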