🤖 AI Summary
To address the unreliability of RAG systems caused by noisy, irrelevant, or counterfactual information introduced by retriever or knowledge-base deficiencies, this paper proposes Robust Fine-Tuning (RbFT), a dual-task robust adaptation paradigm. RbFT is the first method to explicitly model retrieval failures as controllable training signals; it requires no modification to the retriever, incurs no inference overhead, and remains fully compatible with LoRA and mainstream RAG architectures. It employs two supervised fine-tuning tasks: (i) misretrieval-augmented response correction and (ii) retrieval-quality-aware confidence calibration, which together significantly enhance the robustness of large language models under unreliable retrieval inputs. Evaluated across diverse synthetic and real-world defective retrieval scenarios, RbFT improves question-answering accuracy by an average of 12.7% over strong baselines while preserving millisecond-level inference latency.
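The core idea of treating retrieval failures as controllable training signals can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the function names, the defect taxonomy mapping (clean / irrelevant / counterfactual), and the record layout are all assumptions for illustration.

```python
import random

def inject_defects(relevant_docs, distractor_pool, counterfactual_map,
                   defect_rate=0.5, seed=0):
    """Hypothetical defect injection: replace a fraction of retrieved
    passages with defective ones, keeping per-passage quality labels.

    Labels mark each passage as 'clean', 'irrelevant', or
    'counterfactual' (the defect types named in the summary)."""
    rng = random.Random(seed)
    passages, labels = [], []
    for doc in relevant_docs:
        if rng.random() < defect_rate:
            # Prefer a counterfactual rewrite when one exists, else
            # swap in an irrelevant distractor passage.
            if doc in counterfactual_map and rng.random() < 0.5:
                passages.append(counterfactual_map[doc])
                labels.append("counterfactual")
            else:
                passages.append(rng.choice(distractor_pool))
                labels.append("irrelevant")
        else:
            passages.append(doc)
            labels.append("clean")
    return passages, labels

def build_example(question, answer, passages, labels):
    """Assemble one supervised fine-tuning record: the model sees the
    (possibly defective) context but is still trained toward the gold
    answer, with per-passage labels as an auxiliary calibration signal."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return {
        "prompt": f"Context:\n{context}\n\nQuestion: {question}",
        "target_answer": answer,
        "target_quality": labels,  # auxiliary quality-judgment target
    }
```

Because defects are injected at data-construction time, the retriever itself is untouched and inference cost is unchanged, which is what makes the approach compatible with LoRA-style fine-tuning of the generator alone.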
📝 Abstract
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by integrating external knowledge retrieved from a knowledge base. However, its effectiveness is fundamentally constrained by the reliability of both the retriever and the knowledge base. In real-world scenarios, imperfections in these components often lead to the retrieval of noisy, irrelevant, or misleading counterfactual information, ultimately undermining the trustworthiness of RAG systems. To address this challenge, we propose Robust Fine-Tuning (RbFT), a method designed to enhance the resilience of LLMs against retrieval defects through two targeted fine-tuning tasks. Experimental results demonstrate that RbFT significantly improves the robustness of RAG systems across diverse retrieval conditions, surpassing existing methods while maintaining high inference efficiency and compatibility with other robustness techniques.