Retrieval-Augmented Generation in Medicine: A Scoping Review of Technical Implementations, Clinical Applications, and Ethical Considerations

📅 2025-11-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Medical RAG faces several challenges: underutilization of private clinical data, weak support for non-English languages, poor adaptability to low-resource settings, and insufficient evaluation of safety and bias. This scoping review surveys RAG technical implementations, clinical applications (question answering, report generation, text summarization, information extraction), and ethical considerations in medicine. It finds that current work relies heavily on public datasets, English-centric embedding models, and general-purpose LLMs, with limited use of medical-specific LLMs. Automated metrics cover generation quality and task performance, while human evaluation focuses on accuracy, completeness, relevance, and fluency and gives little attention to bias and safety. The review concludes that medical RAG remains at an early stage and needs advances in clinical validation, cross-linguistic adaptation, and support for low-resource settings to enable trustworthy and responsible global use.

📝 Abstract
The rapid growth of medical knowledge and increasing complexity of clinical practice pose challenges. In this context, large language models (LLMs) have demonstrated value; however, inherent limitations remain. Retrieval-augmented generation (RAG) technologies show potential to enhance their clinical applicability. This study reviewed RAG applications in medicine. We found that research primarily relied on publicly available data, with limited application in private data. For retrieval, approaches commonly relied on English-centric embedding models, while LLMs were mostly generic, with limited use of medical-specific LLMs. For evaluation, automated metrics evaluated generation quality and task performance, whereas human evaluation focused on accuracy, completeness, relevance, and fluency, with insufficient attention to bias and safety. RAG applications were concentrated on question answering, report generation, text summarization, and information extraction. Overall, medical RAG remains at an early stage, requiring advances in clinical validation, cross-linguistic adaptation, and support for low-resource settings to enable trustworthy and responsible global use.
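The retrieve-then-generate pattern the abstract refers to can be illustrated with a minimal sketch. This is not taken from the reviewed paper: the toy corpus, the function names (`embed`, `retrieve`, `build_prompt`), and the bag-of-words vector standing in for an embedding model are all assumptions made for illustration, and the final call to an LLM is left out.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; a real system would index guidelines, literature, or records.
DOCUMENTS = [
    "Metformin is a first-line oral therapy for type 2 diabetes.",
    "Amoxicillin is a penicillin-class antibiotic used for bacterial infections.",
    "Warfarin is an anticoagulant that requires regular INR monitoring.",
]


def embed(text):
    """Stand-in for an embedding model: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Assemble the augmented prompt that would be passed to an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the context above."
    )


if __name__ == "__main__":
    print(build_prompt("Which anticoagulant requires INR monitoring?", DOCUMENTS))
```

Because the stand-in vectorizer matches only exact English tokens, a query in another language would share no terms with the corpus. That is a crude analogue of the limitation the abstract attributes to English-centric embedding models.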
Problem

Research questions and friction points this paper is trying to address.

Reviewing RAG technical implementations and clinical applications in medicine
Addressing limitations of large language models for medical knowledge retrieval
Evaluating RAG systems for clinical validation and ethical considerations
Innovation

Methods, ideas, or system contributions that make the work stand out.

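Reviewing how retrieval-augmented generation is applied in medicine
Finding that retrieval commonly relies on English-centric embedding models and generic rather than medical-specific LLMs
Characterizing evaluation practice: automated metrics and human evaluation, with little attention to bias and safety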
Rui Yang
Center for Quantitative Medicine, Duke–NUS Medical School, Singapore 169857, Singapore
Matthew Yu Heng Wong
School of Clinical Medicine, University of Cambridge, Cambridge CB2 0SP, UK
Huitao Li
Duke-NUS Medical School
Medical Informatics
Xin Li
Center for Quantitative Medicine, Duke–NUS Medical School, Singapore 169857, Singapore
Wentao Zhu
Center for Quantitative Medicine, Duke–NUS Medical School, Singapore 169857, Singapore
Jingchi Liao
Center for Quantitative Medicine, Duke–NUS Medical School, Singapore 169857, Singapore
Kunyu Yu
Center for Quantitative Medicine, Duke–NUS Medical School, Singapore 169857, Singapore
Jonathan Chong Kai Liew
Center for Quantitative Medicine, Duke–NUS Medical School, Singapore 169857, Singapore
Weihao Xuan
The University of Tokyo, RIKEN
Natural Language Processing, Computer Vision, Multimodal AI, Generative AI, LLM Agent
Yingjian Chen
Graduate School of Engineering, The University of Tokyo, Tokyo 113-8654, Japan
Yuhe Ke
Division of Anesthesiology and Perioperative Medicine, Singapore General Hospital, Singapore 169608, Singapore
J. Ong
Division of Pharmacy, Singapore General Hospital, Singapore 169608, Singapore
Douglas Teodoro
Professor, University of Geneva
biomedical NLP, machine learning for healthcare, medical informatics
Chuan Hong
Department of Biostatistics and Bioinformatics, Duke School of Medicine, Durham, NC 27710, USA
D. Ting
Singapore Eye Research Institute, Singapore National Eye Center, Singapore 168751, Singapore
Nan Liu
Center for Quantitative Medicine, Duke–NUS Medical School, Singapore 169857, Singapore