Towards Conversational AI for Disease Management

📅 2025-03-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the under-explored capability of large language models (LLMs) in clinical management reasoning, which covers disease progression, reasoning across multiple patient visits, response to therapy, and safe medication prescribing. It extends the Articulate Medical Intelligence Explorer (AMIE), previously evaluated on diagnostic dialogue, into an LLM-based agentic system optimised for clinical management and dialogue. Methodologically, the system uses Gemini's long-context capabilities to combine in-context retrieval with structured reasoning, grounding its output in up-to-date clinical practice guidelines and drug formularies. In a randomized, blinded virtual Objective Structured Clinical Examination (OSCE), AMIE was compared with 21 primary care physicians (PCPs) across 100 multi-visit case scenarios reflecting UK NICE Guidance and BMJ Best Practice guidelines. Specialist physicians rated it non-inferior to PCPs in management reasoning, and it scored better on the preciseness of treatments and investigations and on the alignment and grounding of management plans in clinical guidelines. On RxQA, a multiple-choice medication benchmark derived from US and UK national drug formularies and validated by board-certified pharmacists, both AMIE and PCPs benefited from access to external drug information, and AMIE outperformed PCPs on the higher-difficulty questions. The authors note that further research is needed before real-world translation.

📝 Abstract

While large language models (LLMs) have shown promise in diagnostic dialogue, their capabilities for effective management reasoning - including disease progression, therapeutic response, and safe medication prescription - remain under-explored. We advance the previously demonstrated diagnostic capabilities of the Articulate Medical Intelligence Explorer (AMIE) through a new LLM-based agentic system optimised for clinical management and dialogue, incorporating reasoning over the evolution of disease and multiple patient visit encounters, response to therapy, and professional competence in medication prescription. To ground its reasoning in authoritative clinical knowledge, AMIE leverages Gemini's long-context capabilities, combining in-context retrieval with structured reasoning to align its output with relevant and up-to-date clinical practice guidelines and drug formularies. In a randomized, blinded virtual Objective Structured Clinical Examination (OSCE) study, AMIE was compared to 21 primary care physicians (PCPs) across 100 multi-visit case scenarios designed to reflect UK NICE Guidance and BMJ Best Practice guidelines. AMIE was non-inferior to PCPs in management reasoning as assessed by specialist physicians and scored better in both preciseness of treatments and investigations, and in its alignment with and grounding of management plans in clinical guidelines. To benchmark medication reasoning, we developed RxQA, a multiple-choice question benchmark derived from two national drug formularies (US, UK) and validated by board-certified pharmacists. While AMIE and PCPs both benefited from the ability to access external drug information, AMIE outperformed PCPs on higher difficulty questions. While further research would be needed before real-world translation, AMIE's strong performance across evaluations marks a significant step towards conversational AI as a tool in disease management.
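The abstract reports that AMIE was "non-inferior" to PCPs, which is a weaker claim than "better". The sketch below illustrates the general idea of a non-inferiority check on two proportions using a normal-approximation confidence bound. It is a minimal illustration only: the function name, the counts, and the 10-point margin are hypothetical, and the statistical procedure actually used in the study is not stated on this page.

```python
from math import sqrt
from statistics import NormalDist


def non_inferior(success_new, n_new, success_ref, n_ref, margin, alpha=0.025):
    """One-sided non-inferiority check on two proportions (Wald approximation).

    Returns (lower_bound, verdict). The new arm is called non-inferior when the
    lower confidence bound of (p_new - p_ref) stays above -margin.
    All inputs here are hypothetical; none come from the paper.
    """
    p_new = success_new / n_new
    p_ref = success_ref / n_ref
    diff = p_new - p_ref
    # Standard error of the difference between two independent proportions.
    se = sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    z = NormalDist().inv_cdf(1 - alpha)
    lower = diff - z * se
    return lower, lower > -margin


# Hypothetical counts: 85/100 acceptable plans vs 80/100, margin of 10 points.
lower, verdict = non_inferior(85, 100, 80, 100, margin=0.10)
print(f"lower bound = {lower:.3f}, non-inferior = {verdict}")
```

With these made-up counts the lower bound is negative but above the margin, so the verdict is non-inferior without being superior. Superiority would additionally require the lower bound to exceed zero.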

Problem

Research questions and friction points this paper is trying to address.

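Can LLMs reason about disease management, including disease progression, response to therapy, and safe prescribing, rather than diagnosis alone?

Can AMIE, previously evaluated on diagnostic dialogue, be extended to multi-visit clinical management and medication prescribing?

How does AMIE compare with primary care physicians on management reasoning (virtual OSCE) and medication reasoning (RxQA)?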
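Comparing medication reasoning by question difficulty amounts to computing accuracy separately for each difficulty tier of a multiple-choice benchmark. The sketch below shows one way to do that. The item fields (`id`, `difficulty`, `correct`), the tier labels, and the sample questions are hypothetical and do not reflect the actual RxQA data format.

```python
from collections import defaultdict


def accuracy_by_difficulty(items, answers):
    """Return accuracy per difficulty tier.

    items:   list of dicts with hypothetical keys 'id', 'difficulty', 'correct'
    answers: dict mapping item id to the option chosen by a respondent
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for item in items:
        tier = item["difficulty"]
        totals[tier] += 1
        # An unanswered item counts as incorrect.
        if answers.get(item["id"]) == item["correct"]:
            hits[tier] += 1
    return {tier: hits[tier] / totals[tier] for tier in totals}


# Made-up items and responses, for illustration only.
items = [
    {"id": "q1", "difficulty": "lower", "correct": "A"},
    {"id": "q2", "difficulty": "lower", "correct": "C"},
    {"id": "q3", "difficulty": "higher", "correct": "B"},
    {"id": "q4", "difficulty": "higher", "correct": "D"},
]
answers = {"q1": "A", "q2": "C", "q3": "B", "q4": "A"}
print(accuracy_by_difficulty(items, answers))
```

Running the same function on each respondent group (for example, model and physicians, with and without access to external drug information) would give the per-tier figures needed for a side-by-side comparison.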

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based agentic system for clinical management

Combines in-context retrieval with structured reasoning over clinical guidelines and drug formularies, using Gemini's long-context capabilities

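Non-inferior to primary care physicians in management reasoning, with better scores for preciseness of treatments and investigations and for guideline grounding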
