Human or LLM as Standardized Patients? A Comparative Study for Medical Education

📅 2025-11-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Traditional standardized patients (SPs) are costly, hard to scale, and inflexible, while existing LLM-based SP simulators behave inconsistently and lack rigorous empirical validation against human SPs. To address these limitations, we propose EasyMED, a multi-agent SP simulation framework comprising a Patient Agent, an Auxiliary Agent, and an Evaluation Agent. It incorporates a dialogue consistency mechanism and a structured assessment methodology, and introduces SPBench, the first benchmark specifically designed for SP-oriented evaluation. Experiments across 14 medical specialties show that EasyMED achieves pedagogical efficacy comparable to human SPs, yields greater clinical skill gains for learners with low baseline proficiency, and offers better flexibility, psychological safety, and cost-effectiveness.

📝 Abstract

Standardized patients (SPs) are indispensable for clinical skills training but remain expensive, inflexible, and difficult to scale. Existing large language model (LLM)-based SP simulators promise lower cost, yet they show inconsistent behavior and lack rigorous comparison with human SPs. We present EasyMED, a multi-agent framework combining a Patient Agent for realistic dialogue, an Auxiliary Agent for factual consistency, and an Evaluation Agent that delivers actionable feedback. To support systematic assessment, we introduce SPBench, a benchmark of real SP-doctor interactions spanning 14 specialties and eight expert-defined evaluation criteria. Experiments demonstrate that EasyMED matches human SP learning outcomes while producing greater skill gains for lower-baseline students and offering improved flexibility, psychological safety, and cost efficiency.
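
The three-agent division of labor described above can be illustrated with a minimal sketch. This is not the EasyMED implementation: the abstract does not specify prompts, models, or scoring rules, so every name here (`CaseScript`, `Session`, `ask`, the keyword matching, and the single coverage metric standing in for the eight expert-defined criteria) is a hypothetical placeholder, and the agents are rule-based stubs rather than LLM calls.

```python
from dataclasses import dataclass, field


@dataclass
class CaseScript:
    """Hypothetical patient case: topic keyword -> scripted answer."""
    facts: dict


@dataclass
class Session:
    """One training encounter between a learner and the simulated patient."""
    case: CaseScript
    transcript: list = field(default_factory=list)
    asked_topics: set = field(default_factory=set)


def patient_agent(case, question):
    """Stand-in for an LLM patient: draft a reply by keyword match."""
    q = question.lower()
    for topic, answer in case.facts.items():
        if topic in q:
            return topic, answer
    return None, "I'm not sure, doctor."


def auxiliary_agent(case, topic, draft):
    """Consistency check: replace a draft that drifts from the case script."""
    if topic is None:
        return draft
    scripted = case.facts[topic]
    return draft if draft == scripted else scripted


def evaluation_agent(session):
    """Toy feedback: share of scripted topics the learner asked about."""
    total = set(session.case.facts)
    missed = sorted(total - session.asked_topics)
    coverage = len(session.asked_topics) / len(total) if total else 0.0
    return {"coverage": coverage, "missed_topics": missed}


def ask(session, question):
    """Route one learner question through the patient and auxiliary agents."""
    topic, draft = patient_agent(session.case, question)
    reply = auxiliary_agent(session.case, topic, draft)
    if topic is not None:
        session.asked_topics.add(topic)
    session.transcript.append((question, reply))
    return reply
```

In this sketch, a learner calls `ask` once per question and then `evaluation_agent(session)` at the end of the encounter to see which scripted topics were missed. A real system would replace the keyword lookup and the equality check with model calls.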

Problem

Research questions and friction points this paper is trying to address.

Comparing human versus LLM standardized patients for medical training

Addressing inconsistent behavior and scalability in SP simulators

Developing a multi-agent framework that matches human SP outcomes

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework simulating patient dialogue and feedback

Benchmark with real interactions across 14 medical specialties

Matches human outcomes while improving cost and flexibility
