YpathRAG: A Retrieval-Augmented Generation Framework and Benchmark for Pathology

📅 2025-10-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To reduce hallucination by large language models (LLMs) in high-barrier domains such as pathology, this paper proposes YpathRAG, a retrieval-augmented generation (RAG) framework tailored to pathology. The authors build a pathology vector database of 1.53 million paragraphs across 28 subspecialties, combine BGE-M3 dense retrieval with vocabulary-guided sparse retrieval in a dual-channel hybrid strategy, and add an LLM-based supportive-evidence judgment module that closes the loop between retrieval, judgment, and generation. Key contributions: (1) two pathology-specific evaluation benchmarks, YpathR and YpathQA-M; (2) Recall@5 of 98.64% on YpathR, 23 percentage points above the baseline; and (3) accuracy gains for general-purpose and medical LLMs on YpathQA-M of 9.0% on average and up to 15.6%.

📝 Abstract

Large language models (LLMs) excel on general tasks yet still hallucinate in high-barrier domains such as pathology. Prior work often relies on domain fine-tuning, which neither expands the knowledge boundary nor enforces evidence-grounded constraints. We therefore build a pathology vector database covering 28 subfields and 1.53 million paragraphs, and present YpathRAG, a pathology-oriented RAG framework with dual-channel hybrid retrieval (BGE-M3 dense retrieval coupled with vocabulary-guided sparse retrieval) and an LLM-based supportive-evidence judgment module that closes the retrieval-judgment-generation loop. We also release two evaluation benchmarks, YpathR and YpathQA-M. On YpathR, YpathRAG attains Recall@5 of 98.64%, a gain of 23 percentage points over the baseline; on YpathQA-M, a set of the 300 most challenging questions, it increases the accuracies of both general and medical LLMs by 9.0% on average and up to 15.6%. These results demonstrate improved retrieval quality and factual reliability, providing a scalable construction paradigm and interpretable evaluation for pathology-oriented RAG.
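The abstract reports Recall@5 but does not spell out how it is computed. A minimal sketch of one common definition follows, assuming each query has a set of gold passage IDs and that recall is averaged over queries; the function names and the toy IDs are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the gold passages that appear in the top-k retrieved list."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(
    runs: Dict[str, List[str]], gold: Dict[str, Set[str]], k: int = 5
) -> float:
    """Average recall@k over all queries that have gold annotations."""
    scores = [recall_at_k(runs.get(q, []), gold[q], k) for q in gold]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (hypothetical IDs): q1's gold passage is ranked first,
# q2's gold passage is ranked sixth and so falls outside the top 5.
runs = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z", "w", "v", "g"]}
gold = {"q1": {"a"}, "q2": {"g"}}
score = mean_recall_at_k(runs, gold, k=5)
```

Under this definition, a query counts fully only when all of its gold passages land in the top k; a benchmark with one gold passage per query reduces to a hit rate.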

Problem

Research questions and friction points this paper is trying to address.

Reducing LLM hallucinations in pathology with an evidence-grounded RAG framework

Improving retrieval quality in medical domains with dual-channel hybrid retrieval

Improving the factual reliability of pathology QA with a supportive-evidence judgment module
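The retrieval, judgment, and generation stages named above can be pictured as a simple filter-then-generate pipeline. The sketch below is an illustration only: the summary does not describe how the stages are wired together, so the function `answer_with_evidence` and its three callable parameters are hypothetical placeholders for a retriever, an LLM-based judge, and a generator.

```python
from typing import Callable, List


def answer_with_evidence(
    question: str,
    retrieve: Callable[[str], List[str]],
    is_supportive: Callable[[str, str], bool],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Retrieve candidates, keep only judged-supportive ones, then generate."""
    candidates = retrieve(question)
    # The judge sees (question, passage) pairs and rejects unsupportive text.
    evidence = [p for p in candidates if is_supportive(question, p)]
    # The generator is conditioned only on evidence that passed the judge.
    return generate(question, evidence)


# Stub components standing in for real models (assumptions for the demo).
def stub_retrieve(question: str) -> List[str]:
    return ["passage on topic A", "passage on topic B"]


def stub_judge(question: str, passage: str) -> bool:
    return "topic A" in passage


def stub_generate(question: str, evidence: List[str]) -> str:
    return f"answer grounded in {len(evidence)} passage(s)"


result = answer_with_evidence(
    "question about topic A", stub_retrieve, stub_judge, stub_generate
)
```

Passing the three stages in as callables keeps the control flow separate from any particular model, so each stage can be swapped or evaluated on its own.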

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-channel hybrid retrieval with dense and sparse methods

LLM-based supportive-evidence judgment module for verification

Pathology vector database covering 28 subfields and 1.53M paragraphs
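One way to combine a dense channel and a sparse channel is to fuse their ranked lists. The sketch below uses reciprocal rank fusion (RRF), a generic fusion technique chosen here for illustration; the summary does not state how YpathRAG merges its two channels, and the passage IDs and the constant `k=60` are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each list adds 1 / (k + rank) to a passage's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of the two channels for a single query.
dense_ranking = ["p1", "p2", "p3"]   # e.g. from an embedding model
sparse_ranking = ["p3", "p1", "p4"]  # e.g. from a term-matching index
fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
```

RRF works on ranks rather than raw scores, so it needs no calibration between the dense similarity scale and the sparse scoring scale; passages found by both channels rise to the top.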
