Towards Omni-RAG: Comprehensive Retrieval-Augmented Generation for Large Language Models in Medical Applications

📅 2025-01-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Medical large language models (LLMs) hallucinate because they integrate medical knowledge only partially. Method: This paper proposes the Source Planning Optimization (SPO) framework. It constructs MedOmniKB, a multi-source, heterogeneous medical knowledge base, and formalizes multi-source retrieval as an explicit source planning task. SPO decouples planning exploration, carried out by an expert model, from alignment learning, carried out by a lightweight student model. The student is trained on positive and negative planning samples to mitigate planning failures caused by a mismatch between what the model expects a source to contain and what it actually contains. In short, the method integrates heterogeneous multi-source knowledge, models source-selection logic, and co-optimizes expert and student. Results: Experiments show that the lightweight student model achieves state-of-the-art performance across multiple medical RAG benchmarks, suppressing hallucinations and improving the accuracy and reliability of clinical reasoning.
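To make "explicit source planning" concrete, here is a minimal sketch of a plan as a list of (source, query) steps executed against a toy knowledge base. Everything in it is an assumption for illustration: the `SourceQuery` type, the source names `guideline` and `drug_db`, the example sentences, and the word-overlap matching are hypothetical and are not taken from MedOmniKB or the paper's retrievers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceQuery:
    """One step of a source plan (hypothetical structure, not the paper's)."""
    source: str  # name of the knowledge source to consult
    query: str   # query text written for that particular source


# Toy stand-in for a multi-source knowledge base; contents are illustrative only.
TOY_KB = {
    "guideline": ["Metformin is a first-line therapy for type 2 diabetes."],
    "drug_db": ["Metformin: contraindicated in severe renal impairment."],
}


def execute_plan(plan, kb):
    """Run every step and keep documents sharing at least one word with the query."""
    retrieved = []
    for step in plan:
        terms = set(step.query.lower().split())
        for doc in kb.get(step.source, []):  # unknown sources yield nothing
            doc_terms = set(doc.lower().replace(".", "").replace(":", "").split())
            if terms & doc_terms:
                retrieved.append((step.source, doc))
    return retrieved


plan = [
    SourceQuery("guideline", "metformin first-line therapy"),
    SourceQuery("drug_db", "metformin contraindicated"),
]
results = execute_plan(plan, TOY_KB)
for source, doc in results:
    print(f"[{source}] {doc}")
```

The point of the sketch is only that a plan names which source to query and with what wording; a query phrased for the wrong source retrieves nothing, which is the kind of mismatch the summary describes.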

📝 Abstract
Large language models (LLMs) hold promise for addressing healthcare challenges but often generate hallucinations due to limited integration of medical knowledge. Incorporating external medical knowledge is therefore critical, especially considering the breadth and complexity of medical content, which necessitates effective multi-source knowledge acquisition. We address this challenge by framing it as a source planning problem, where the task is to formulate context-appropriate queries tailored to the attributes of diverse knowledge sources. Existing approaches either overlook source planning or fail to achieve it effectively due to misalignment between the model's expectation of the sources and their actual content. To bridge this gap, we present MedOmniKB, a comprehensive repository comprising multigenre and multi-structured medical knowledge sources. Leveraging these sources, we propose the Source Planning Optimisation (SPO) method, which enhances multi-source utilisation through explicit planning optimisation. Our approach involves enabling an expert model to explore and evaluate potential plans while training a smaller model to learn source alignment using positive and negative planning samples. Experimental results demonstrate that our method substantially improves multi-source planning performance, enabling the optimised small model to achieve state-of-the-art results in leveraging diverse medical knowledge sources.
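The idea of learning from positive and negative planning samples can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' pipeline: the abstract does not say how the expert model scores plans or which training objective the small model uses, so the numeric scores, the `threshold` parameter, the `build_planning_pairs` helper, and the `prompt`/`chosen`/`rejected` record layout are all hypothetical.

```python
def build_planning_pairs(question, scored_plans, threshold=0.5):
    """Split explored plans into positives and negatives, then pair them up.

    scored_plans: list of (plan_text, score) tuples, where the score is assumed
    to come from some expert-side evaluation of the plan (hypothetical).
    """
    positives = [plan for plan, score in scored_plans if score >= threshold]
    negatives = [plan for plan, score in scored_plans if score < threshold]
    # Each positive is contrasted with each negative for the same question.
    return [
        {"prompt": question, "chosen": pos, "rejected": neg}
        for pos in positives
        for neg in negatives
    ]


# Illustrative plans and scores; none of these values come from the paper.
explored = [
    ("guideline: metformin dosing", 0.9),
    ("wiki: history of diabetes", 0.1),
    ("drug_db: metformin renal impairment", 0.7),
]
pairs = build_planning_pairs("Can this patient take metformin?", explored)
for pair in pairs:
    print(pair["chosen"], ">", pair["rejected"])
```

Records of this shape could feed a preference-style training step for the smaller model; which objective is actually used is not stated in the abstract.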
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Medical Knowledge Integration
Healthcare Applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

MedOmniKB
SPO method
Medical Knowledge Alignment
Authors
Zhe Chen
Shanghai Jiao Tong University, Shanghai Artificial Intelligence Laboratory
Yusheng Liao
Shanghai Jiao Tong University, Shanghai Artificial Intelligence Laboratory
Large Language Models, Clinical NLP, Agent, Reasoning
Shuyang Jiang
Shanghai AI Lab, Research Intern
Natural language processing, Machine learning
Pingjie Wang
Shanghai Jiao Tong University
Model Compression, Inference Acceleration
Yiqiu Guo
Fudan University, Shanghai Artificial Intelligence Laboratory
Yanfeng Wang
Shanghai Jiao Tong University
Yu Wang
Shanghai Jiao Tong University, Shanghai Artificial Intelligence Laboratory