DrugRAG: Enhancing Pharmacy LLM Performance Through A Novel Retrieval-Augmented Generation Pipeline

📅 2025-12-16
🤖 AI Summary
This study addresses the limited accuracy of large language models (LLMs) on pharmacist licensure examination-style question answering. The authors propose DrugRAG, a lightweight, model-agnostic retrieval-augmented generation (RAG) pipeline that operates outside the model and requires no architectural modification or fine-tuning. The method uses a three-step retrieval process over structured drug knowledge from validated sources to fetch evidence passages, then injects them into the prompt as evidence-based context. Key contributions include: (i) a RAG pipeline designed for pharmacy that can be applied across models without adaptation; and (ii) fully external integration, which narrows the performance gap between small and large models. Evaluated on a 141-question pharmacy QA benchmark, the approach improves accuracy across 11 LLMs by 7 to 21 percentage points (e.g., Llama 3.1 8B rises from 46% to 67%), moving compact models closer to top-tier proprietary models such as GPT-5 (92% at baseline).

📝 Abstract
Objectives: To evaluate large language model (LLM) performance on pharmacy licensure-style question-answering (QA) tasks and develop an external knowledge integration method to improve their accuracy.
Methods: We benchmarked eleven existing LLMs with varying parameter sizes (8 billion to 70+ billion) using a 141-question pharmacy dataset. We measured baseline accuracy for each model without modification. We then developed a three-step retrieval-augmented generation (RAG) pipeline, DrugRAG, that retrieves structured drug knowledge from validated sources and augments model prompts with evidence-based context. This pipeline operates externally to the models, requiring no changes to model architecture or parameters.
Results: Baseline accuracy ranged from 46% to 92%, with GPT-5 (92%) and o3 (89%) achieving the highest scores. Models with fewer than 8 billion parameters scored below 50%. DrugRAG improved accuracy across all tested models, with gains ranging from 7 to 21 percentage points (e.g., Gemma 3 27B: 61% to 71%, Llama 3.1 8B: 46% to 67%) on the 141-item benchmark.
Conclusion: We demonstrate that external structured drug knowledge integration through DrugRAG measurably improves LLM accuracy on pharmacy tasks without modifying the underlying models. This approach provides a practical pipeline for enhancing pharmacy-focused AI applications with evidence-based information.
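The gains quoted in the abstract are percentage-point differences between baseline and DrugRAG accuracy. As a small worked example, the sketch below recomputes them for the two models the abstract names and converts each accuracy into an approximate number of correct answers out of 141. The item counts are estimates only, because the abstract reports rounded percentages; the function names are illustrative and not taken from the paper.

```python
# Worked arithmetic on the figures reported in the abstract.
# Item counts are approximations: the source gives rounded percentages only.

N_ITEMS = 141  # size of the benchmark stated in the abstract

# (baseline accuracy %, accuracy with DrugRAG %) as reported in the abstract
reported = {
    "Gemma 3 27B": (61, 71),
    "Llama 3.1 8B": (46, 67),
}


def gain_points(before_pct: int, after_pct: int) -> int:
    """Percentage-point gain: a plain difference, not a relative change."""
    return after_pct - before_pct


def approx_correct(pct: int, n_items: int = N_ITEMS) -> int:
    """Approximate number of correct answers implied by a rounded accuracy."""
    return round(pct * n_items / 100)


for model, (before, after) in reported.items():
    print(
        f"{model}: +{gain_points(before, after)} points, "
        f"about {approx_correct(before)} -> {approx_correct(after)} "
        f"of {N_ITEMS} items"
    )
```

Both examples fall inside the reported 7 to 21 point range, with Llama 3.1 8B at the upper end.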
Problem

Research questions and friction points this paper is trying to address.

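LLMs answer pharmacy licensure-style exam questions with limited accuracy, especially smaller models
External drug knowledge needs to be integrated without changing model architecture or parameters
Pharmacy AI applications need evidence-based context to ground their answers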
Innovation

Methods, ideas, or system contributions that make the work stand out.

DrugRAG uses a three-step retrieval-augmented generation pipeline
It retrieves structured drug knowledge from validated external sources
The pipeline augments prompts with evidence-based context externally
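The summary does not specify what the three steps are, so the following is only a minimal sketch of how an external, model-agnostic pipeline of this kind could be arranged. The step breakdown (find drug mentions, look up structured entries, build an augmented prompt), the `TOY_KB` dictionary, and every function name are assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical three-step external RAG sketch. The steps, names, and the toy
# knowledge base are illustrative assumptions, not the DrugRAG implementation.

# Toy stand-in for a structured drug knowledge base from validated sources.
TOY_KB = {
    "warfarin": "Warfarin is a vitamin K antagonist anticoagulant.",
    "metformin": "Metformin is a biguanide used to treat type 2 diabetes.",
}


def find_drug_mentions(question: str, kb: dict[str, str]) -> list[str]:
    """Step 1 (assumed): find knowledge-base drug names in the question."""
    lowered = question.lower()
    return [name for name in kb if name in lowered]


def retrieve_evidence(drugs: list[str], kb: dict[str, str]) -> list[str]:
    """Step 2 (assumed): fetch the structured entry for each matched drug."""
    return [kb[name] for name in drugs]


def build_prompt(question: str, evidence: list[str]) -> str:
    """Step 3 (assumed): prepend the evidence to the question as context."""
    if not evidence:
        return question  # nothing retrieved: pass the question through as-is
    context = "\n".join(f"- {passage}" for passage in evidence)
    return f"Context:\n{context}\n\nQuestion: {question}"


def augment(question: str, kb: dict[str, str] = TOY_KB) -> str:
    """Run the three steps; the result can be sent to any LLM unchanged."""
    drugs = find_drug_mentions(question, kb)
    evidence = retrieve_evidence(drugs, kb)
    return build_prompt(question, evidence)


if __name__ == "__main__":
    print(augment("Which drug class does metformin belong to?"))
```

Because the augmentation produces only a text prompt, the same function can sit in front of any model, which is the property the summary describes as operating externally to the models.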
Houman Kazemzadeh
Department of Medicinal Chemistry, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran, Iran
Kiarash Mokhtari Dizaji
Department of Computer Sciences, Faculty of Mathematics and Computer Sciences, Amir Kabir University of Technology, Tehran, Iran
Seyed Reza Tavakoli
Department of Mathematical Sciences, Sharif University of Technology, Tehran, Iran
Farbod Davoodi
Department of Computer Science, Missouri University of Science and Technology, Rolla, MO, USA
MohammadReza KarimiNejad
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
Parham Abed Azad
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
Ali Sabzi
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
Armin Khosravi
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
Siavash Ahmadi
Electronics Research Institute, Sharif University of Technology, Tehran, Iran
Mohammad Hossein Rohban
Associate Professor in Computer Engineering, Sharif University of Technology
Machine Learning, Statistics, Computational Biology
Gholamali Aminian
The Alan Turing Institute, London, United Kingdom
Tahereh Javaheri
Health Informatics Lab, Metropolitan College, Boston University, Boston, USA