KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model

📅 2025-02-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of interpretable drug recommendation. We propose KEDRec-LM, a large language model (LLM) trained via knowledge-distillation-driven instruction fine-tuning, and introduce expRxRec, the first publicly available, multi-source heterogeneous dataset integrating knowledge graphs, clinical trial records, and PubMed literature. Methodologically, we pioneer the integration of knowledge distillation with instruction tuning, leveraging drug graph embeddings, clinical text encodings, and PubMed semantic alignment to jointly generate accurate drug recommendations and natural-language medical explanations. Experimental results demonstrate that KEDRec-LM significantly outperforms existing baselines in both recommendation accuracy and rationale plausibility. Both the expRxRec dataset and the KEDRec-LM model are fully open-sourced, establishing a new benchmark and practical toolkit for interpretable biomedical AI research.

📝 Abstract

Drug discovery is a critical task in biomedical natural language processing (NLP), yet explainable drug discovery remains underexplored. Meanwhile, large language models (LLMs) have shown remarkable abilities in natural language understanding and generation. Leveraging LLMs for explainable drug discovery has the potential to improve downstream tasks and real-world applications. In this study, we utilize open-source drug knowledge graphs, clinical trial data, and PubMed publications to construct a comprehensive dataset for the explainable drug discovery task, named expRxRec. Furthermore, we introduce KEDRec-LM, an instruction-tuned LLM which distills knowledge from a rich medical knowledge corpus for drug recommendation and rationale generation. To encourage further research in this area, we will publicly release both the dataset and KEDRec-LM (a copy is attached with this submission).
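To make the abstract's pipeline more concrete, the sketch below shows one way a single instruction-tuning record could be assembled from the three source types it names (knowledge-graph triples, clinical trial text, and PubMed text). This is an illustrative assumption, not the authors' actual schema: the `Evidence` class, the `build_record` function, the `instruction`/`input`/`output` field names, and all placeholder values such as `DrugA` and `DiseaseX` are hypothetical.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class Evidence:
    """Hypothetical container for the three source types named in the abstract."""
    kg_triples: list = field(default_factory=list)       # (head, relation, tail) tuples
    trial_snippets: list = field(default_factory=list)   # clinical trial text excerpts
    pubmed_snippets: list = field(default_factory=list)  # PubMed text excerpts


def build_record(disease, candidates, evidence, answer, rationale):
    """Assemble one instruction-tuning example (assumed format, not the paper's)."""
    if answer not in candidates:
        raise ValueError("answer must be one of the candidate drugs")

    # Flatten each evidence source into tagged context lines.
    context = [f"KG: {h} --{r}--> {t}" for h, r, t in evidence.kg_triples]
    context += [f"Trial: {s}" for s in evidence.trial_snippets]
    context += [f"PubMed: {s}" for s in evidence.pubmed_snippets]

    instruction = (
        f"Disease: {disease}\n"
        f"Candidate drugs: {', '.join(candidates)}\n"
        "Select the more suitable drug and explain why."
    )
    return {
        "instruction": instruction,
        "input": "\n".join(context),
        # The target pairs the recommendation with its rationale, so the
        # model is trained to produce both in a single response.
        "output": f"Recommendation: {answer}\nRationale: {rationale}",
    }


# Placeholder values only; none of this is real clinical data.
example = build_record(
    disease="DiseaseX",
    candidates=["DrugA", "DrugB"],
    evidence=Evidence(
        kg_triples=[("DrugA", "treats", "DiseaseX")],
        trial_snippets=["placeholder trial summary"],
        pubmed_snippets=["placeholder abstract sentence"],
    ),
    answer="DrugA",
    rationale="placeholder rationale text",
)
print(json.dumps(example, indent=2))
```

In a distillation setting, the `rationale` string would typically come from a stronger teacher model prompted with the same evidence. The abstract does not specify which teacher model or prompt is used, so that step is left out here.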

Problem

Research questions and friction points this paper is trying to address.

Explainable drug discovery using LLMs

Construction of comprehensive drug dataset

Knowledge-distilled LLM for drug recommendation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge-distilled LLM for drug recommendation

Instruction-tuned model for rationale generation

Open-source dataset from medical knowledge graphs

🔎 Similar Papers

Large Language Model Distilling Medication Recommendation Model