DischargeSim: A Simulation Benchmark for Educational Doctor-Patient Communication at Discharge

📅 2025-09-08
🤖 AI Summary
Existing LLM benchmarks focus mainly on in-visit diagnostic reasoning and neglect post-discharge patient education, a critical component of care continuity. Method: The paper introduces DischargeSim, a benchmark for discharge communication in which an LLM-driven DoctorAgent and PatientAgent hold multi-turn, personalized dialogues across diverse patient profiles. The evaluation combines automatic and LLM-as-judge assessment of dialogue quality, personalized document generation (free-text summaries and structured AHRQ checklists), and a multiple-choice comprehension exam. Contribution/Results: Experiments on 18 LLMs show that larger models do not always achieve better educational outcomes, and they point to trade-offs between content prioritization and strategy use. Performance also varies widely across patient profiles, indicating that current LLMs have significant gaps in delivering personalized discharge education.

📝 Abstract
Discharge communication is a critical yet underexplored component of patient care, where the goal shifts from diagnosis to education. While recent large language model (LLM) benchmarks emphasize in-visit diagnostic reasoning, they fail to evaluate models' ability to support patients after the visit. We introduce DischargeSim, a novel benchmark that evaluates LLMs on their ability to act as personalized discharge educators. DischargeSim simulates post-visit, multi-turn conversations between LLM-driven DoctorAgents and PatientAgents with diverse psychosocial profiles (e.g., health literacy, education, emotion). Interactions are structured across six clinically grounded discharge topics and assessed along three axes: (1) dialogue quality via automatic and LLM-as-judge evaluation, (2) personalized document generation including free-text summaries and structured AHRQ checklists, and (3) patient comprehension through a downstream multiple-choice exam. Experiments across 18 LLMs reveal significant gaps in discharge education capability, with performance varying widely across patient profiles. Notably, model size does not always yield better education outcomes, highlighting trade-offs in strategy use and content prioritization. DischargeSim offers a first step toward benchmarking LLMs in post-visit clinical education and promoting equitable, personalized patient support.
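
The multi-turn setup described in the abstract can be pictured as a simple loop over discharge topics. The following is a minimal sketch under stated assumptions, not the benchmark's actual implementation: the names `PatientProfile`, `simulate_discharge`, `stub_doctor`, and `stub_patient` are hypothetical, the topic labels are placeholders (the six topics are not named here), and the stub functions stand in for LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Placeholder labels only; the six discharge topics are not named in this text.
TOPICS: List[str] = [f"topic_{i}" for i in range(1, 7)]

Turn = Tuple[str, str]  # (speaker, utterance)


@dataclass
class PatientProfile:
    """Hypothetical container for the psychosocial attributes the abstract mentions."""
    health_literacy: str
    education: str
    emotion: str


def simulate_discharge(
    doctor: Callable[[str, List[Turn]], str],
    patient: Callable[[PatientProfile, List[Turn]], str],
    profile: PatientProfile,
    topics: List[str] = TOPICS,
    turns_per_topic: int = 2,
) -> List[Turn]:
    """Alternate doctor and patient turns for each topic and return the transcript."""
    transcript: List[Turn] = []
    for topic in topics:
        for _ in range(turns_per_topic):
            transcript.append(("doctor", doctor(topic, transcript)))
            transcript.append(("patient", patient(profile, transcript)))
    return transcript


def stub_doctor(topic: str, transcript: List[Turn]) -> str:
    # Stand-in for an LLM-driven DoctorAgent.
    return f"Let me explain {topic}."


def stub_patient(profile: PatientProfile, transcript: List[Turn]) -> str:
    # Stand-in for an LLM-driven PatientAgent whose reply depends on the profile.
    if profile.health_literacy == "low":
        return "Could you say that more simply?"
    return "Understood."
```

In a real run the two callables would wrap model calls, and the returned transcript would feed the dialogue-quality, document-generation, and comprehension evaluations.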

Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs' ability to support post-visit patient education
Assessing personalized discharge communication across diverse patient profiles
Benchmarking clinical education capabilities beyond diagnostic reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Simulates multi-turn doctor-patient discharge conversations
Evaluates LLMs on personalized document generation
Assesses patient comprehension via multiple-choice exams
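
To illustrate how comprehension on a multiple-choice exam might be compared across patient profiles, here is a small sketch. It assumes exam results arrive as `(profile_label, answers, answer_key)` tuples; that data layout and the function names `exam_accuracy` and `accuracy_by_profile` are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def exam_accuracy(answers: Sequence[str], key: Sequence[str]) -> float:
    """Fraction of multiple-choice answers that match the answer key."""
    if len(answers) != len(key):
        raise ValueError("answers and key must have the same length")
    if not key:
        return 0.0
    return sum(a == k for a, k in zip(answers, key)) / len(key)


def accuracy_by_profile(
    results: Iterable[Tuple[str, Sequence[str], Sequence[str]]]
) -> Dict[str, float]:
    """Pool questions per profile label and return accuracy for each label."""
    totals: Dict[str, List[int]] = defaultdict(lambda: [0, 0])  # [correct, asked]
    for label, answers, key in results:
        if len(answers) != len(key):
            raise ValueError("answers and key must have the same length")
        totals[label][0] += sum(a == k for a, k in zip(answers, key))
        totals[label][1] += len(key)
    return {
        label: (correct / asked if asked else 0.0)
        for label, (correct, asked) in totals.items()
    }
```

Grouping by profile label makes it easy to see whether a model's explanations work equally well for, say, low and high health-literacy patients.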

Authors

Zonghai Yao
UMass Amherst
Medical-LLM, Multi-agent AI Hospital, Clinical Reasoning, Synthetic Data, Patient Education
Michael Sun
Manning College of Information and Computer Sciences, UMass Amherst, MA, USA; Miner School of Computer and Information Sciences, UMass Lowell, MA, USA
Won Seok Jang
PhD student, University of Massachusetts Lowell
natural language processing, healthcare
Sunjae Kwon
UMass Amherst
Machine Learning, Natural Language Processing, Lexical Semantics, Public Health, AI in Healthcare
Soie Kwon
Department of Internal Medicine, Chung-Ang University, Seoul, Republic of Korea
Hong Yu
Center for Healthcare Organization and Implementation Research, VA Bedford Health Care; Manning College of Information and Computer Sciences, UMass Amherst, MA, USA; Miner School of Computer and Information Sciences, UMass Lowell, MA, USA; Department of Medicine, UMass Chan Medical School, Worcester, MA, USA