🤖 AI Summary
This paper addresses critical limitations of Retrieval-Augmented Generation (RAG) systems in safety-critical domains: low factual consistency, poor interpretability, and frequent hallucinations. To address these, the authors propose a self-explanatory contrastive evidence re-ranking method with two key innovations: (1) a token-level, attribution-driven contrastive learning framework that explicitly distinguishes factual from misleading evidence via subjectivity-aware hard negative sampling; and (2) joint fine-tuning of the embedding space with generation of token-level attribution rationales, aligning retrieval results with the evidence reasoning process. Evaluated on a clinical trial report dataset, the method achieves a 12.7% improvement in retrieval accuracy and a 38.4% reduction in hallucination rate. Crucially, it provides traceable, verifiable attribution grounds for each inference step. This work establishes a novel, interpretable, and robust retrieval paradigm for high-assurance RAG systems.
📝 Abstract
This extended abstract introduces Self-Explaining Contrastive Evidence Re-Ranking (CER), a novel method that restructures retrieval around factual evidence by fine-tuning embeddings with contrastive learning and generating a token-level attribution rationale for each retrieved passage. Hard negatives are selected automatically using a subjectivity-based criterion, forcing the model to pull factual rationales closer while pushing subjective or misleading explanations apart; the result is an embedding space explicitly aligned with evidential reasoning. We evaluated our method on clinical trial reports, and initial experimental results show that CER improves retrieval accuracy, mitigates hallucinations in RAG systems, and provides transparent, evidence-based retrieval that enhances reliability, especially in safety-critical domains.
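The two ingredients the abstract describes, subjectivity-based hard-negative selection and contrastive fine-tuning of the embedding space, can be sketched roughly as follows. This is a minimal illustration under assumptions of our own: the subjectivity lexicon, the InfoNCE-style loss, and all function names are hypothetical stand-ins, since the abstract does not specify the actual criterion, objective, or encoder.

```python
import numpy as np

# Hypothetical subjectivity lexicon; the paper's actual criterion is not specified.
SUBJECTIVE_MARKERS = {"believe", "likely", "probably", "may", "suggests", "opinion"}

def subjectivity_score(passage: str) -> float:
    """Fraction of tokens that are subjective markers (toy proxy criterion)."""
    tokens = passage.lower().split()
    return sum(t in SUBJECTIVE_MARKERS for t in tokens) / max(len(tokens), 1)

def select_hard_negatives(candidates: list[str], k: int = 2) -> list[str]:
    """Pick the k most subjective candidate passages as hard negatives."""
    return sorted(candidates, key=subjectivity_score, reverse=True)[:k]

def info_nce_loss(query: np.ndarray, positive: np.ndarray,
                  negatives: list[np.ndarray], temperature: float = 0.07) -> float:
    """InfoNCE-style contrastive loss: pull the factual (positive) passage
    embedding toward the query, push subjective negatives away."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    sims = np.array([cos(query, positive)] + [cos(query, n) for n in negatives])
    logits = sims / temperature
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -float(np.log(probs[0]))
```

In a full pipeline, the loss above would backpropagate through the passage encoder so that, after fine-tuning, nearest-neighbour retrieval in the embedding space already favours factual over subjective evidence.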