Orchestrator Multi-Agent Clinical Decision Support System for Secondary Headache Diagnosis in Primary Care

📅 2025-12-03
🤖 AI Summary
Early identification of secondary headaches in primary care is hindered by time constraints, incomplete clinical information, and high symptom heterogeneity, leading to frequent misdiagnosis or delayed diagnosis. To address these challenges, this paper proposes an orchestrator-specialist multi-agent clinical decision support system. The diagnostic task is decomposed into seven domain-specific agents, each built on an open-source large language model (e.g., Qwen, Llama) and supporting two prompting mechanisms, guideline-driven (GPrompt) and question-driven (QPrompt), to enable structured, evidence-based reasoning with full traceability. This architecture is designed to improve diagnostic interpretability and clinical adaptability. Evaluated on 90 expert-annotated cases, GPrompt improves average F1-score by 12.3% over baseline; gains are especially pronounced for smaller models, and the multi-agent system consistently outperforms single-model approaches.
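
For readers unfamiliar with the metric, the F1-score mentioned above is the harmonic mean of precision and recall. The sketch below shows the standard computation from true-positive, false-positive, and false-negative counts. It is a generic illustration and not the paper's evaluation code; the summary does not say how scores were averaged across cases or diagnoses, nor whether the 12.3% gain is absolute or relative, so the function name and the example counts are assumptions.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1 from confusion counts: 2*TP / (2*TP + FP + FN).

    This form is algebraically equal to the harmonic mean of precision
    and recall. Returns 0.0 when there are no positives at all, a common
    convention (assumed here, not taken from the paper).
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts, for illustration only.
example = f1_score(tp=8, fp=2, fn=2)
```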

📝 Abstract
Unlike most primary headaches, secondary headaches need specialized care and can have devastating consequences if not treated promptly. Clinical guidelines highlight several 'red flag' features, such as thunderclap onset, meningismus, papilledema, focal neurologic deficits, signs of temporal arteritis, systemic illness, and the 'worst headache of their life' presentation. Despite these guidelines, determining which patients require urgent evaluation remains challenging in primary care settings. Clinicians often work with limited time, incomplete information, and diverse symptom presentations, which can lead to under-recognition and inappropriate care. We present a large language model (LLM)-based multi-agent clinical decision support system built on an orchestrator-specialist architecture, designed to perform explicit and interpretable secondary headache diagnosis from free-text clinical vignettes. The multi-agent system decomposes diagnosis into seven domain-specialized agents, each producing a structured and evidence-grounded rationale, while a central orchestrator performs task decomposition and coordinates agent routing. We evaluated the multi-agent system using 90 expert-validated secondary headache cases and compared its performance with a single-LLM baseline across two prompting strategies: question-based prompting (QPrompt) and clinical practice guideline-based prompting (GPrompt). We tested five open-source LLMs (Qwen-30B, GPT-OSS-20B, Qwen-14B, Qwen-8B, and Llama-3.1-8B), and found that the orchestrated multi-agent system with GPrompt consistently achieved the highest F1 scores, with larger gains in smaller models. These findings demonstrate that structured multi-agent reasoning improves accuracy beyond prompt engineering alone and offers a transparent, clinically aligned approach for explainable decision support in secondary headache diagnosis.
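
The orchestrator-specialist pattern described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the specialist names (`vascular`, `infectious`), the keyword-matching stand-in for an LLM call, and every class and function name are hypothetical. The abstract does not name the seven agents or describe how routing is done.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rationale:
    agent: str           # specialist that produced the rationale
    finding: str         # short structured conclusion
    evidence: List[str]  # cues from the vignette that support the finding


@dataclass
class Specialist:
    name: str
    cues: List[str]  # red-flag phrases this specialist looks for

    def matches(self, vignette: str) -> List[str]:
        text = vignette.lower()
        return [cue for cue in self.cues if cue in text]

    def assess(self, vignette: str) -> Rationale:
        # A real agent would call an LLM here; keyword matching is only
        # a stand-in so that the sketch runs offline.
        hits = self.matches(vignette)
        finding = "red flag present" if hits else "no red flag found"
        return Rationale(agent=self.name, finding=finding, evidence=hits)


class Orchestrator:
    def __init__(self, specialists: List[Specialist]):
        self.specialists = specialists

    def route(self, vignette: str) -> List[Specialist]:
        # Task decomposition: pick only the specialists whose cues appear.
        return [s for s in self.specialists if s.matches(vignette)]

    def run(self, vignette: str) -> List[Rationale]:
        return [s.assess(vignette) for s in self.route(vignette)]


# Hypothetical specialists; the cue phrases are red flags listed in the abstract.
ORCHESTRATOR = Orchestrator([
    Specialist("vascular", ["thunderclap", "worst headache"]),
    Specialist("infectious", ["meningismus", "systemic illness"]),
])

if __name__ == "__main__":
    for rationale in ORCHESTRATOR.run("Thunderclap onset with meningismus."):
        print(rationale)
```

Returning one structured rationale per routed specialist keeps each conclusion traceable to the cues that triggered it, which is the interpretability property the abstract emphasizes.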
Problem

Research questions and friction points this paper is trying to address.

Develops a multi-agent system for diagnosing secondary headaches in primary care
Addresses challenges of limited time and incomplete information in diagnosis
Improves accuracy over single models via structured, interpretable reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent system with orchestrator-specialist architecture for diagnosis
Decomposes diagnosis into seven domain-specialized agents for structured reasoning
Uses clinical guideline-based prompting to improve accuracy and transparency
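
The difference between the two prompting strategies can be illustrated with a pair of prompt builders. This is a sketch under assumptions: the page does not reproduce the actual QPrompt or GPrompt templates, so the wording, function names, and parameters below are hypothetical. The only point shown is that a guideline-based prompt embeds explicit criteria for the model to check, while a question-based prompt asks a direct question.

```python
from typing import List


def build_qprompt(vignette: str, question: str) -> str:
    """Question-based prompt: the vignette followed by one direct question."""
    return (
        f"Clinical vignette:\n{vignette}\n\n"
        f"Question: {question}\n"
        "Answer and explain your reasoning."
    )


def build_gprompt(vignette: str, criteria: List[str]) -> str:
    """Guideline-based prompt: the vignette plus numbered criteria to check."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(criteria, start=1))
    return (
        f"Clinical vignette:\n{vignette}\n\n"
        "Check each guideline criterion below and cite supporting text:\n"
        f"{numbered}"
    )


# Hypothetical inputs; the criteria are red flags named in the abstract.
VIGNETTE = "58-year-old with sudden severe headache and neck stiffness."
Q_PROMPT = build_qprompt(VIGNETTE, "Is a secondary headache likely?")
G_PROMPT = build_gprompt(VIGNETTE, ["thunderclap onset", "papilledema"])
```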
Authors

Xizhi Wu
Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA, USA
Nelly Estefanie Garduno-Rapp
Clinical Informatics Center, UT Southwestern Medical Center, Dallas, TX, USA
Justin F Rousseau
Associate Professor of Neurology, University of Texas Southwestern Medical Center
Clinical Informatics, Neurology, Natural Language Processing, Clinical Decision Support
Mounika Thakkallapally
Clinical Informatics Center, UT Southwestern Medical Center, Dallas, TX, USA
Hang Zhang
Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA
Yuelyu Ji
University of Pittsburgh
Natural language processing, Health information detection, Large language model evaluation
Shyam Visweswaran
Shyam Visweswaran
Professor of Biomedical Informatics, University of Pittsburgh
artificial intelligence, machine learning, biomedical informatics, clinical decision support
Yifan Peng
Population Health Sciences, Weill Cornell Medicine, New York, NY, USA; Institute of Artificial Intelligence for Digital Health, Weill Cornell Medicine, New York, NY, USA
Yanshan Wang
Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA, USA; Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA; Institute of Artificial Intelligence for Digital Health, Weill Cornell Medicine, New York, NY, USA; Clinical and Translational Science Institute, University of Pittsburgh, Pittsburgh, PA, USA