MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data

📅 2023-04-14
📈 Citations: 246
✹ Influential: 39
đŸ€– AI Summary
To address patient-privacy and on-premises deployment requirements in healthcare, this work introduces an open-source medical dataset of more than 160,000 entries and releases a corresponding collection of fine-tuned conversational models. Methodologically, it builds on the LLaMA/Alpaca line of models, applying supervised fine-tuning on medical instruction data. The contributions are threefold: (1) open-weight models that can be deployed offline in privacy-sensitive medical settings; (2) a training dataset crafted specifically for medical applications; and (3) a comparison of pre-trained-only and fine-tuned models on the examinations that future physicians must pass for certification (e.g., USMLE).
📝 Abstract
As large language models (LLMs) like OpenAI's GPT series continue to make strides, we witness the emergence of artificial intelligence applications in an ever-expanding range of fields. In medicine, these LLMs hold considerable promise for improving medical workflows, diagnostics, patient care, and education. Yet, there is an urgent need for open-source models that can be deployed on-premises to safeguard patient privacy. In our work, we present an innovative dataset consisting of over 160,000 entries, specifically crafted to fine-tune LLMs for effective medical applications. We investigate the impact of fine-tuning publicly accessible pre-trained LLMs on these datasets, and we then compare the performance of pre-trained-only models against the fine-tuned models on the examinations that future medical doctors must pass to achieve certification.
Problem

Research questions and friction points this paper is trying to address.

Develop open-source medical conversational AI models
Ensure patient privacy with on-premises deployment
Evaluate fine-tuned models on medical certification exams
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source medical conversational AI models
Fine-tuning with 160,000 medical entries
Comparison of pre-trained vs fine-tuned models
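
The comparison of pre-trained and fine-tuned models on exam questions can be illustrated with a minimal sketch. It assumes a multiple-choice format scored by exact match on the answer letter; the function name `exam_accuracy` and the toy answer lists are hypothetical and are not taken from the paper or its data.

```python
from typing import Sequence


def exam_accuracy(predictions: Sequence[str], answer_key: Sequence[str]) -> float:
    """Return the fraction of multiple-choice questions answered correctly."""
    if len(predictions) != len(answer_key):
        raise ValueError("predictions and answer_key must have the same length")
    if not answer_key:
        raise ValueError("answer_key must not be empty")
    # Compare answer letters case-insensitively, ignoring surrounding whitespace.
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answer_key)
    )
    return correct / len(answer_key)


# Hypothetical toy data: these are NOT results from the paper.
answer_key = ["A", "C", "B", "D"]
base_predictions = ["A", "B", "B", "A"]        # stand-in for a pre-trained-only model
finetuned_predictions = ["A", "C", "B", "A"]   # stand-in for a fine-tuned model

base_acc = exam_accuracy(base_predictions, answer_key)
finetuned_acc = exam_accuracy(finetuned_predictions, answer_key)
print(f"pre-trained only: {base_acc:.2f}, fine-tuned: {finetuned_acc:.2f}")
```

In practice the prediction lists would come from prompting each model with the same exam questions, so that any difference in accuracy reflects the fine-tuning rather than the question set.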
T. Han
Department of Radiology, University Hospital Aachen, Aachen, Germany
Lisa C. Adams
Department of Diagnostic and Interventional Radiology, Technical University of Munich, Munich, Germany
Jens-Michalis Papaioannou
Berliner Hochschule fĂŒr Technik (BHT), Berlin, Germany
Paul Grundmann
Berliner Hochschule fĂŒr Technik
Natural Language Processing
Tom Oberhauser
Berliner Hochschule fĂŒr Technik (BHT), Berlin, Germany
Alexander Löser
Professor of Data Science and Text-based Information Systems
Clinical Text Mining, Machine Reading, Language Technology
D. Truhn
Department of Radiology, University Hospital Aachen, Aachen, Germany
K. Bressem
CharitĂ© – UniversitĂ€tsmedizin Berlin, corporate member of Freie UniversitĂ€t Berlin and Humboldt-UniversitĂ€t zu Berlin, Institute for Radiology, Berlin, Germany; Berlin Institute of Health at CharitĂ© – UniversitĂ€tsmedizin Berlin, Germany