🤖 AI Summary
To address the low credibility of outputs from retrieval-augmented generation (RAG) systems, this paper proposes a three-tier trust-enhancement framework: (1) hierarchical semantic chunking and indexing to preserve contextual integrity; (2) utility-aware retrieval filtering to dynamically prune irrelevant content; and (3) sentence-level claim–citation alignment for fine-grained provenance tracing. The method integrates semantic chunking, a utility-scoring model, and citation reasoning techniques, and is implemented within the open-source RAG Studio platform. Evaluated across multiple benchmarks, it achieves a 23.6% improvement in citation precision and reduces average input length by 37%, significantly enhancing answer accuracy and citation reliability. This work establishes the first systematic, trust-oriented RAG enhancement paradigm—enabling explainable, verifiable, and customizable RAG deployment.
📝 Abstract
Retrieval-augmented generation (RAG) has emerged as a crucial technique for enhancing large models with real-time and domain-specific knowledge. While numerous improvements and open-source tools have been proposed to refine the RAG framework for accuracy, relatively little attention has been given to improving the trustworthiness of generated results. To address this gap, we introduce TrustRAG, a novel framework that enhances RAG from three perspectives: indexing, retrieval, and generation. Specifically, in the indexing stage, we propose a semantic-enhanced chunking strategy that incorporates hierarchical indexing to supplement each chunk with contextual information, ensuring semantic completeness. In the retrieval stage, we introduce a utility-based filtering mechanism to identify high-quality information, supporting answer generation while reducing input length. In the generation stage, we propose fine-grained citation enhancement, which detects opinion-bearing sentences in responses and infers citation relationships at the sentence level, thereby improving citation accuracy. We open-source the TrustRAG framework and provide a demonstration studio designed for excerpt-based question answering tasks (https://huggingface.co/spaces/golaxy/TrustRAG). With these resources, we aim to help researchers (1) systematically enhance the trustworthiness of RAG systems and (2) develop their own RAG systems with more reliable outputs.
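The utility-based filtering stage can be illustrated with a minimal sketch. This is not the paper's implementation: TrustRAG uses a trained utility-scoring model, whereas the stand-in score below is a simple lexical-overlap heuristic, and all function names here are hypothetical.

```python
# Hypothetical sketch of utility-based retrieval filtering:
# score each retrieved chunk for its usefulness to the query,
# then keep only high-utility chunks to shorten the generator's input.

def utility_score(query: str, chunk: str) -> float:
    """Stand-in utility score: fraction of query tokens found in the chunk.
    TrustRAG's actual filter uses a learned utility-scoring model instead."""
    q_tokens = set(query.lower().split())
    c_tokens = set(chunk.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & c_tokens) / len(q_tokens)

def filter_chunks(query: str, chunks: list[str], threshold: float = 0.3) -> list[str]:
    """Rank chunks by utility and drop those below the threshold,
    reducing prompt length while retaining answer-supporting evidence."""
    scored = sorted(((utility_score(query, c), c) for c in chunks), reverse=True)
    return [c for score, c in scored if score >= threshold]
```

A filter like this prunes low-relevance context before generation, which is how the framework reduces input length without discarding the evidence needed for citation.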