YpathRAG: A Retrieval-Augmented Generation Framework and Benchmark for Pathology

📅 2025-10-07
🤖 AI Summary
To address hallucination issues in large language models (LLMs) within high-barrier domains such as pathology, this paper proposes YpathRAG—the first retrieval-augmented generation (RAG) framework tailored for pathology. Methodologically, it constructs a comprehensive pathology vector database comprising 1.53 million paragraphs across 28 subspecialties, employs a dual-channel hybrid retrieval strategy integrating BGE-M3 dense retrieval with lexicon-guided sparse retrieval, and introduces an LLM-driven supportive evidence discrimination module to enable closed-loop optimization of retrieval, verification, and generation. Key contributions include: (1) releasing YpathR and YpathQA-M—two pathology-specific evaluation benchmarks; (2) achieving Recall@5 of 98.64%, outperforming baseline methods by 23 percentage points; and (3) improving average accuracy of general-purpose and medical LLMs by 9.0% on YpathQA-M, with maximum gains up to 15.6%, thereby substantially enhancing factual accuracy and interpretability.

📝 Abstract
Large language models (LLMs) excel on general tasks yet still hallucinate in high-barrier domains such as pathology. Prior work often relies on domain fine-tuning, which neither expands the knowledge boundary nor enforces evidence-grounded constraints. We therefore build a pathology vector database covering 28 subfields and 1.53 million paragraphs, and present YpathRAG, a pathology-oriented RAG framework with dual-channel hybrid retrieval (BGE-M3 dense retrieval coupled with vocabulary-guided sparse retrieval) and an LLM-based supportive-evidence judgment module that closes the retrieval-judgment-generation loop. We also release two evaluation benchmarks, YpathR and YpathQA-M. On YpathR, YpathRAG attains Recall@5 of 98.64%, a gain of 23 percentage points over the baseline; on YpathQA-M, a set of the 300 most challenging questions, it increases the accuracies of both general and medical LLMs by 9.0% on average and up to 15.6%. These results demonstrate improved retrieval quality and factual reliability, providing a scalable construction paradigm and interpretable evaluation for pathology-oriented RAG.
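The retrieval result above is reported as Recall@5. The summary does not spell out how the paper computes it, so the sketch below uses the common definition: the share of a query's relevant passages that appear among the top k retrieved ones, averaged over queries. The function names and the toy passage ids are illustrative assumptions, not part of YpathR.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 5) -> float:
    """Fraction of the relevant passages found among the top-k retrieved ids."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def mean_recall_at_k(runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 5) -> float:
    """Average recall@k over (ranked_ids, relevant_ids) pairs, one pair per query."""
    runs = list(runs)
    return sum(recall_at_k(ranked, gold, k) for ranked, gold in runs) / len(runs)


# Toy example (hypothetical ids): one of the two relevant passages is in the top 5.
ranked = ["p1", "p2", "p3", "p4", "p5", "p6"]
gold = {"p1", "p6"}
score = recall_at_k(ranked, gold, k=5)
```

Note that a gain of 23 percentage points is an absolute difference between two such scores, not a relative improvement.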
Problem

Research questions and friction points this paper is trying to address.

Addressing LLM hallucinations in pathology through an evidence-grounded RAG framework
Improving retrieval quality in medical domains with dual-channel hybrid retrieval
Enhancing the factual reliability of pathology QA via a supportive-evidence judgment module
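The retrieval, judgment, and generation stages named in these points can be sketched as a single pass. This is a minimal illustration under stated assumptions: `retrieve`, `judge`, and `generate` are hypothetical callables standing in for the retriever, the LLM-based judge, and the answering LLM, and the toy functions below use keyword overlap rather than any model. Whether the paper iterates the loop is not stated in this summary.

```python
from typing import Callable, Dict, List


def answer_with_evidence(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    judge: Callable[[str, str], bool],
    generate: Callable[[str, List[str]], str],
    top_k: int = 5,
) -> Dict[str, object]:
    """Retrieve candidates, keep only passages judged supportive, then generate."""
    candidates = retrieve(question, top_k)
    supported = [p for p in candidates if judge(question, p)]
    answer = generate(question, supported)
    return {"answer": answer, "evidence": supported}


# Toy stand-ins (assumptions for illustration only).
TOY_CORPUS = [
    "Melanoma shows atypical melanocytes with pagetoid spread.",
    "Chronic gastritis shows lymphoplasmacytic infiltrate.",
]


def toy_retrieve(question: str, top_k: int) -> List[str]:
    return TOY_CORPUS[:top_k]


def toy_judge(question: str, passage: str) -> bool:
    # Keyword overlap stands in for an LLM verdict on whether the passage supports the question.
    q_terms = {w.strip("?.,").lower() for w in question.split() if len(w) > 4}
    p_terms = {w.strip("?.,").lower() for w in passage.split()}
    return bool(q_terms & p_terms)


def toy_generate(question: str, evidence: List[str]) -> str:
    # Answer only from retained evidence; otherwise decline.
    return evidence[0] if evidence else "insufficient evidence"


result = answer_with_evidence(
    "What are features of melanoma?", toy_retrieve, toy_judge, toy_generate
)
```

Returning the retained passages alongside the answer is what makes the output inspectable: a reader can check each claim against the evidence the judge accepted.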
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-channel hybrid retrieval with dense and sparse methods
LLM-based supportive-evidence judgment module for verification
Pathology vector database covering 28 subfields and 1.53M paragraphs
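A dense channel and a sparse channel each produce a ranking, and the two must be merged. The summary does not say how YpathRAG combines them, so the sketch below swaps in reciprocal rank fusion (RRF), a common fusion rule, purely as an assumption. The dense channel here uses cosine similarity over hand-written toy vectors rather than BGE-M3 embeddings, and the sparse channel counts query terms that belong to a small hypothetical lexicon.

```python
import math
from typing import Dict, List, Sequence, Set


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dense_rank(query_vec: Sequence[float], doc_vecs: Dict[str, Sequence[float]]) -> List[str]:
    """Rank document ids by cosine similarity to the query vector."""
    return sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)


def sparse_rank(query: str, docs: Dict[str, str], lexicon: Set[str]) -> List[str]:
    """Rank document ids by how many lexicon terms from the query they contain."""
    q_terms = {t for t in query.lower().split() if t in lexicon}

    def score(doc_id: str) -> int:
        return len(q_terms & set(docs[doc_id].lower().split()))

    return sorted(docs, key=score, reverse=True)


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per document."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Toy data (hypothetical documents, vectors, and lexicon).
docs = {
    "d1": "benign nevus of the skin",
    "d2": "invasive ductal carcinoma of the breast",
    "d3": "chronic gastritis",
}
doc_vecs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.7, 0.7]}
lexicon = {"carcinoma", "ductal", "nevus", "gastritis"}

fused = rrf_fuse([
    dense_rank([0.1, 0.9], doc_vecs),
    sparse_rank("ductal carcinoma grading", docs, lexicon),
])
```

RRF works on ranks rather than raw scores, which avoids having to calibrate similarity values from the two channels against each other; a weighted score sum would be an equally plausible alternative.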
Authors
Deshui Yu
Tsinghua University Shenzhen International Graduate School, China
Yizhi Wang
Tsinghua University Shenzhen International Graduate School, China
Saihui Jin
China Unicom Guangdong Branch, China
Taojie Zhu
Tsinghua University Shenzhen International Graduate School, China
Fanyi Zeng
Tsinghua University Shenzhen International Graduate School, China
Wen Qian
Tsinghua University Shenzhen International Graduate School, China
Zirui Huang
Tsinghua University Shenzhen International Graduate School, China
Jingli Ouyang
Tsinghua University Shenzhen International Graduate School, China
Jiameng Li
KU Leuven
Computer vision, deep learning
Zhen Song
Siemens Corporation, Corporate Technology
Building automation, building energy management, optimal control, robotics, optimization
Tian Guan
Tsinghua University Shenzhen International Graduate School, China
Yonghong He
Tsinghua University Shenzhen International Graduate School
Biomedical engineering, optical imaging, AI image processing, large models for pathology