TrustRAG: An Information Assistant with Retrieval Augmented Generation

📅 2025-02-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the low credibility of outputs from retrieval-augmented generation (RAG) systems, this paper proposes a three-tier trust-enhancement framework: (1) hierarchical semantic chunking and indexing to preserve contextual integrity; (2) utility-aware retrieval filtering to dynamically prune irrelevant content; and (3) sentence-level claim–citation alignment for fine-grained provenance tracing. The method integrates semantic chunking, a utility-scoring model, and citation reasoning techniques, and is implemented within the open-source RAG Studio platform. Evaluated across multiple benchmarks, it achieves a 23.6% improvement in citation precision and reduces average input length by 37%, significantly enhancing answer accuracy and citation reliability. This work establishes the first systematic, trust-oriented RAG enhancement paradigm—enabling explainable, verifiable, and customizable RAG deployment.

Technology Category

Application Category

📝 Abstract
Ac{RAG} has emerged as a crucial technique for enhancing large models with real-time and domain-specific knowledge. While numerous improvements and open-source tools have been proposed to refine the ac{RAG} framework for accuracy, relatively little attention has been given to improving the trustworthiness of generated results. To address this gap, we introduce TrustRAG, a novel framework that enhances ac{RAG} from three perspectives: indexing, retrieval, and generation. Specifically, in the indexing stage, we propose a semantic-enhanced chunking strategy that incorporates hierarchical indexing to supplement each chunk with contextual information, ensuring semantic completeness. In the retrieval stage, we introduce a utility-based filtering mechanism to identify high-quality information, supporting answer generation while reducing input length. In the generation stage, we propose fine-grained citation enhancement, which detects opinion-bearing sentences in responses and infers citation relationships at the sentence-level, thereby improving citation accuracy. We open-source the TrustRAG framework and provide a demonstration studio designed for excerpt-based question answering tasks footnote{https://huggingface.co/spaces/golaxy/TrustRAG}. Based on these, we aim to help researchers: 1) systematically enhancing the trustworthiness of ac{RAG} systems and (2) developing their own ac{RAG} systems with more reliable outputs.
Problem

Research questions and friction points this paper is trying to address.

Enhancing trustworthiness in RAG systems
Improving semantic completeness in indexing
Ensuring citation accuracy in generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic-enhanced hierarchical indexing
Utility-based information filtering
Fine-grained citation enhancement
🔎 Similar Papers
No similar papers found.